SUMMARY

This research work deals with the assessment of certain manipulative 
skills in school science practical work. The results were used to study the 
reliability and validity of assessment, and also to investigate the effect of
teaching on pupils' ability to retain the knowledge and reproduce the 
taught skills and even to extend the taught skills to a similar new 
situation. The work was carried out in 3 stages. 

In the first stage two skills were assessed, viz. (1) measurement of volume
by measuring cylinder, and (2) pouring out the liquid, both involved in the
reaction of Mg ribbon with dil. HCl. Approximately 75 fourth-form
pupils (14-15 years old), divided into 3 sets, carried out the experiments in
three successive sessions. In each session 9 judges, divided into 3 panels,
assessed, in the laboratory, the pupils' ability to perform
the techniques involved in the experiment (referred to as Method-T).
Judges in each of 2 panels assessed one of the two skills, while those in the
third panel assessed both skills for each pupil in the set. The arrangement
was such that by the end of the 3 sessions each panel of judges had assessed
both skills simultaneously in one session and a single skill in each of the
other 2 sessions.
The final result, i.e. the time for the Mg to dissolve completely, recorded and
reported by each pupil, was checked against a standard result to carry out
an assessment based on the final outcome (product) of the extended task
involving the skill (referred to as Method-P).

Both the plot of the data and the calculated Pearson-r values indicate that no
significant correlation exists between the pupils' achievement scores obtained
from the two methods of assessment, for either of the skills in
any of the three sessions. These two aspects, (i) comparison of pupils'
achievement based on the two methods of assessment, and (ii) calculation
of Pearson-r, have been considered as the measures of the validity of
assessment.

To study the reliability of assessment, two criteria were considered: (1)
comparison between the assessment of either of the skills by a panel of
judges engaged in assessing only one skill with that by another panel
engaged in assessing both the skills simultaneously, and (2) the total number
of agreements/disagreements among the 3 judges in each panel for
assessing a skill.

Results show that the mean value of achievement for carrying out a skill
remains more or less the same, irrespective of the change in the mode of
assessment. The mean value of achievement for the pouring out skill is
much higher than that for the volume measurement skill. Poor performance
(~33% marks) in the volume measurement skill may be the reason for the low
achievement of pupils assessed by Method-P.

Agreement among the judges is almost 100% for assessment of the 
pouring out skill, while it is about 50% for the volume measurement 
skill. Furthermore it remains unaffected whether the judges assessed one 
skill alone or 2 skills at the same time. 

Results of 2X3 ANOVA indicate that variables such as the method of
assessment, the number of panels of judges, and the number of assessment
sessions have no significant effect on the number of
agreements/disagreements among the judges assessing each of the skills.
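
How such an analysis partitions the variation can be sketched for a balanced
two-factor design. This is a minimal illustration only, assuming equal numbers
of observations in every cell; the function name, the factor labels and the
numbers are hypothetical and are not data from this study. The resulting F
ratios would still have to be compared with tabulated critical values.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def two_way_anova(cells):
    """Balanced two-way ANOVA with replication.

    cells[i][j] holds the observations for level i of factor A and
    level j of factor B. Every cell must hold the same number (>= 2)
    of observations. Returns (sums of squares, degrees of freedom, F ratios).
    Note: raises ZeroDivisionError if there is no within-cell variation.
    """
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])
    balanced = all(len(row) == b and all(len(c) == n for c in row) for row in cells)
    if n < 2 or not balanced:
        raise ValueError("design must be balanced with at least 2 values per cell")

    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Partition the variation into main effects, interaction and error.
    ss_a = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_cells = n * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_within = sum((x - cell_means[i][j]) ** 2
                    for i in range(a) for j in range(b) for x in cells[i][j])

    ss = {"A": ss_a, "B": ss_b, "AxB": ss_cells - ss_a - ss_b, "within": ss_within}
    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1), "within": a * b * (n - 1)}
    ms_within = ss["within"] / df["within"]
    f_ratios = {k: (ss[k] / df[k]) / ms_within for k in ("A", "B", "AxB")}
    return ss, df, f_ratios


# Hypothetical 2 X 3 layout: 2 methods of assessment (factor A) by
# 3 sessions (factor B), with 2 agreement counts in each cell.
scores = [
    [[1, 3], [2, 4], [3, 5]],
    [[1, 3], [2, 4], [3, 5]],
]
ss, df, f_ratios = two_way_anova(scores)
print(ss, df, f_ratios)
```

In this made-up layout the two rows are identical, so the sum of squares for
factor A is zero and all of the between-cell variation is attributed to
factor B.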



The second stage of the research concerns pupils' ability to
retain and reproduce process skills taught earlier. All the pupils who took
part in the first stage of the research also formed the sample here.
Pupils in each of the 3 sets were randomly divided into 3 groups: 

(1) Volume measurement group, who were taught the skill of measuring 
volume using a measuring cylinder, 

(2) Heating group, who were taught the skill of heating a solid sample 
using a Bunsen burner, and 

(3) Control group, who were shown a video film involving none of the 
skills mentioned earlier. 

After the summer holidays, which gave a break of 2 months, a test session
was organised. All the pupils carried out 2 experiments, one involving
the volume measurement skill and the other using the heating skill.
Following a check-list, 8 science teachers assessed the pupils'
manipulative skills (assessment by Method-T).

The results show that in the volume measurement experiment the taught 
group is twice as efficient as the untaught groups, but in the heating
experiment the taught group is only slightly better than the untaught 
groups. 

Analysis of variance (3X3 ANOVA) shows that the results of the
assessment are influenced by the variables (method group and session) at the
p < 0.001 level for the volume measurement skill and also for 2 skills
(position of the test-tube holder and angle of the test tube) out of a total
of 4 skills involved in the heating of solids experiment.



In the 3rd and final stage of the investigation, studies have been made on 
pupils' ability to transfer a skill which has been taught some time earlier, 
to a new situation. Two sets of 3rd-form pupils (13-14 years old) from a local
independent school took part in this study. For the teaching session, each 
set was randomly divided into 3 groups: 

(1) A Volume group, who were taught volume measurement using a
measuring cylinder,

(2) A Heating group, who were taught heating solids using a Bunsen
burner, and

(3) A Control group, who watched a video film involving neither of these
skills.

In a test session, after a further break of 6 weeks, all the pupils were asked
to carry out 2 experiments which involved 2 skills, viz. (1) volume
measurement using a burette, and (2) heating a liquid using a Bunsen
burner. Assessment was carried out (by Method-T), following a check-list,
by 4 science teachers. The density data, reported as the results of the
volume measurement experiment, formed the basis for the assessment, by
Method-P, of pupils' performance of the manipulative skill.

Results of assessments by both the methods show that the taught group 
have performed better in both the experiments. Calculation of Pearson-r 
values indicates that there is no significant correlation between the pupils' 
achievement data obtained from these two methods of assessment. 
Validity of assessment carried out in this way has also been discussed. 
Studies of 3X2 ANOVA suggest that the variables such as method groups 
or sessions have no influence on pupils' achievement. 
CHAPTER 1
INTRODUCTION 



1.1 FOREWORD TO THE PRESENT STUDY 

As the recently introduced GCSE examination system includes marks 
for practical skills in science subjects, it has become more important 
than ever before to learn how to teach these skills effectively. 

Teaching of practical skills can be described as a success, provided the 
pupils can reproduce these skills accurately in similar circumstances, or 
use this knowledge to carry out similar operations in the future. 

Accurate assessment, like effective teaching, could be an equally difficult 
task in which fairness and reliability are important criteria. The 
outcome of an assessment not only provides information about the 
standard achieved by the pupil, it also creates scope for further teaching 
plans. 

Furthermore, assessment could be based either (i) on the observation 
and marking of the performance of the manipulative skills/techniques 
during the practical session (referred to as Method-T), or (ii) on the
outcome of an extended task and marking of the final results/products
submitted as a written report of the work (referred to as Method-P). As
the execution of the manipulative skill is likely to affect the final result, 
assessment from either direction could be expected to be equally 
acceptable for grading the pupil, but it is perhaps worthwhile 
beforehand to compare data from both the sources and ascertain if a 
significant correlation exists between them. 
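
The comparison described here can be illustrated with a minimal sketch of the
Pearson product-moment correlation between the marks obtained from the two
methods. The function names and the marks are hypothetical and serve only to
illustrate the calculation; they are not data from this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical marks for five pupils under the two methods.
method_t = [2, 4, 6, 8, 10]   # marks from on-the-spot observation
method_p = [1, 3, 2, 5, 4]    # marks from the reported result

r = pearson_r(method_t, method_p)
t = t_statistic(r, len(method_t))
print(round(r, 3), round(t, 3))
```

The t value would then be compared with the tabulated critical value for
n - 2 degrees of freedom to judge whether the correlation is significant.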



The correct assessment of manipulative skills during a practical
session is a difficult task, because a teacher may have to judge a large
number of pupils in a short period of time and may have
preconceived views about a pupil's ability. It was therefore
thought worthwhile to check the reliability of such assessment
among a number of judges by comparing the marks given by them for a
pupil's ability to execute certain manipulative skills.
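
One simple way of making such a comparison is to count, pupil by pupil, how
often all the judges in a panel award the same mark. The sketch below assumes
pass/fail marks (1 or 0) from three judges; the function name and the marks are
hypothetical and are not data from this study.

```python
def count_agreements(marks_by_judge):
    """Count pupils on whom all judges agree, and those on whom they do not.

    marks_by_judge holds one list of marks per judge, with the pupils
    in the same order in every list.
    """
    n_pupils = len(marks_by_judge[0])
    if any(len(marks) != n_pupils for marks in marks_by_judge):
        raise ValueError("every judge must mark the same number of pupils")
    # zip(*...) regroups the marks so that each tuple belongs to one pupil.
    agreements = sum(1 for pupil_marks in zip(*marks_by_judge)
                     if len(set(pupil_marks)) == 1)
    return agreements, n_pupils - agreements


# Hypothetical marks from a panel of three judges for four pupils.
panel = [
    [1, 0, 1, 1],  # judge 1
    [1, 0, 0, 1],  # judge 2
    [1, 1, 0, 1],  # judge 3
]
agreed, disagreed = count_agreements(panel)
percent_agreement = 100 * agreed / (agreed + disagreed)
print(agreed, disagreed, percent_agreement)
```

Unanimous agreement is a strict criterion; a pairwise count between judges
would be an alternative design choice.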

1.2 Historical development of assessment in school science practical work

The following is a summary of a paper by Lock, in which he compiled
in detail a history of practical work in school science and its
assessment.

Practical work in school science in Britain started in the second half of
the nineteenth century. In the early stages practical work filled a largely
supportive role, i.e. confirming the theory which had already been
taught. A similar verification approach also existed in the USA.
It was towards the end of the nineteenth century, when Armstrong's
heurism began to exert an influence on the nature of practical
work, that this "discovery approach" to science had widespread
effects on both sides of the Atlantic.



Following the First World War, the Thompson Committee were critical
of school practical work, as a lot of time was devoted to laboratory work
involving repetitive practical exercises.



The 1944 Education Act had little direct impact on practical work, but it
did make recommendations as to the number and size of science rooms
that a school should have [6].

The debate over the effectiveness of individual practical work as compared
with demonstrations went on from the 1920s to the 1950s, particularly in
the USA, and although the 1950s in America saw the planning of new
courses, which heralded the return of discovery learning, controversy
over this has continued [7, 8].

Meanwhile in Britain, the late 1940s and 1950s saw a continuation of the 
confirmatory role of practical work — with demonstration moving 
especially up to the end of compulsory schooling (year 5). Study in the
sixth form featured more involvement of pupils in practical work, but
it remained corroborative of theory or biased towards developing a high
proficiency in a restricted range of techniques, such as titration and
dissection. 

With practical work biased towards learning technique, it was not 
difficult to assess the skills in a three-hour practical examination. Such 
a situation persisted throughout the 1960s and into the 1980s. The 
influence of the Nuffield A-level Curriculum was, however, felt more 
broadly in the 1970s, with some A-level syllabuses changing the practical 
examination format and adopting internal teacher assessment of 
practical skills. 



There were no practical examinations or internal assessment of practical
skills with the Nuffield GCE O-level courses (although Nuffield O-level
chemistry did provide internally assessed options).



The other significant event of the 1960s, with implications for school 
practical work, was the introduction of the Certificate of Secondary 
Education (CSE). The teachers had the freedom of designing their own 
syllabuses and adopting their own method of assessment. 

The extent to which continued assessment by internal means was
employed in science syllabuses is well illustrated by Eggleston, and it
is interesting to note that the following years saw a decline, at CSE level,
of the compulsory practical examinations [11]. The other method of
assessment that gained in popularity at CSE level was the project. This
was clearly an adoption of ideas introduced by some of the Nuffield
Advanced level schemes.

An event in the late 1970s with potential implications for practical work
in schools and its assessment was the speech of the then Prime
Minister, James Callaghan, at Ruskin College in Oxford. This led to the
establishment of the APU (Assessment of Performance Unit), whose
framework included the assessment of practical skills.

In 1982, the Oxford Certificate of Educational Achievement (OCEA)
announced provisions for graded tests which would incorporate
assessment of practical skills.



When the national criteria for GCSE (General Certificate of Secondary
Education) examinations in science were published in May 1985, they
made it clear that: "All schemes of assessment must allocate not less
than 20% of the total marks to experimental and theoretical skills".

In summary, it can be said that the relatively recent introduction of
teacher-assessed techniques was influenced by Nuffield A-level courses 
and the CSE scheme, both of which involved the teacher in the role of 
assessor. 

The development of GCSE has created the need for the internal assessment
of the practical skills of all candidates entered for science examinations at
the age of 16.

1.3 Review of some previous work on assessment of practical skills in school science

Owing to the growing importance of science in the school curriculum and
the increasing emphasis given to the assessment of practical work, the
laboratory has in some instances been described as "...not just a place
for demonstration and confirmation, but rather as the core of the
Science learning process".

Until the late 1970s, in most cognitive assessment activity in science,
importance was given to the recall of knowledge. At this time, Bloom's
"Cognitive" taxonomy [16] was adopted as a basis for the assessment
objectives. However, in the 1980s, more importance has started to be
given to the assessment of intellectual (cognitive) and practical
(psychomotor) skills and abilities.

Abolition of the norm-referenced two-tier examination system, viz. GCE
and CSE, and the introduction of the criterion-referenced GCSE in their
place, created a new situation in science education in schools. In this
examination system, there is a compulsory element of teacher-assessed 
practical work, carried out by the pupils throughout the 2-year schooling 
period. 

Some of the research work carried out in conjunction with various
assessment projects (e.g. SAPA and NAEP in the USA, Nuffield Science
5-13, TVEI and GCSE in the UK) has been reviewed recently by
Johnson [17].

Hofstein and Lunetta, having reviewed a selection of past research
work, suggested that the assessment of practical skills has remained
a neglected area. Questions ought to be raised (1) about the nature of
the work pupils are doing in the laboratory, and (2) about an
appropriate method of assessing their work. In this context the work of
Olsen can be referred to, as it makes a clear distinction between the
development of knowledge (know-that) and the development of skill
(know-how). He stressed the fact that skills are often undervalued by
being used merely as a means to gain knowledge, although the acquisition
of scientific skills is much more important than that. Similar stress was
given to scientific skills by the DES/APU project [20] and by Bryce et al.,
who suggested that there is evidence that pupils who may not
do well in traditional assessment, which emphasises recall of
knowledge, may demonstrate better performance in carrying out work
involving scientific skills.

Regarding the in-school assessment of pupils' practical skills, the work
of the TAPS group is very familiar to many teachers. The TAPS
project has evaluated techniques for the assessment of pupils' practical
skills and produced guidance materials for teachers. The TAPS
assessment pack for teachers has been produced to cover a large number
of skills, viz. measuring, observing, recording, drawing inferences,
problem solving, etc.

However, a contradictory view of school science practical work is proposed
by Miller. He suggests that the purpose of an
experiment in a science laboratory is to develop knowledge and
understanding through interpretation, which is closer to real
science. This is very different from the traditional view of teaching
various skills, as has been advocated by the Nuffield reports and then
echoed by the DES reports [29]. The Nuffield report states that "pupils
acquire the feeling of doing science, of being ... 'a scientist for the day'".
According to the DES reports, "the essential characteristic of education
in science is that it introduces pupils to the methods of science".

Recently Woolnough and Toh, in reporting their work on assessment
of practical work in the school laboratory, rejected the idea of assessing
individual skills, i.e. the "scientific processes" involved in various steps
of the work. Instead they advocated the use of a "holistic approach", i.e.
the assessment of the final outcome of the whole experiment.



From this it appears that they have ignored the fact that science is a
practically based subject, and that how efficiently the manipulative skills
in practical work are performed is important for achieving the correct
results submitted in the form of a written report. In an experiment, a
number of manipulative skills are involved. All of them may be
important, but some are likely to be more important than others as far as
their impact on the final result of the experiment is concerned. Furthermore,
all the skills involved in an experiment cannot be assessed in one
session because of various constraints. Performance of the practical
skills and answering the check-list items may be complementary to each
other when a pupil is assessed for his/her practical work. However,
either of these two criteria can be used as a basis for assessment,
provided it is proved that there is a significant correlation between
them. Woolnough and Toh [30] have reported a strong correlation
between two methods of assessment, and they tried to show that the
final report is a good substitute for assessment in the laboratory. Perhaps
it would be worthwhile to carry out a further study and check if there is
a strong correlation between the assessment of the performance of skills
at the work-place and the assessment of the results reported on paper on
the basis of the practical work.

In dismissing the idea of assessing practical skills, Woolnough and
Toh [30] have cited the work of Miller and Driver [31], who share the view
that the learning of an individual psychomotor skill by pupils in a
laboratory is of no importance regarding their acquisition of scientific
knowledge.

Miller and Driver propose that "school science" is different from "real
science", although they do not clearly define what "real science" is
or how it comes about. They failed to understand that the various scientific
skills learned by a pupil, and the knowledge gained thereby in a school
laboratory, enable him/her to perform a more complicated task in a
scientific laboratory after leaving school. Furthermore, these individual
tasks taught and assessed in a school science laboratory are not seen as
isolated process skills by the pupils. They aim to achieve a final
objective for which various tasks are to be performed, and in the end
these activities lead to a final result.

In their paper Miller and Driver criticised the idea proposed by Bryce
et al., who gave considerable importance to the process skills.

Bryce and Robertson replied to this criticism by saying: "Miller and
Driver have taken restricted views of process in scientific, psychological
and pedagogical terms and thus misrepresented the thinking behind the
assessment materials developed by the TAPS research group".



Pupils' understanding of science through reasoning and problem
solving (discovery method) may be the final goal, but to achieve it, I
feel that they should be taught and assessed on some basic scientific
skills which will enable them to carry out more complicated tasks
in their future life (perhaps as professional scientists).

Although Miller and Driver [31] tried to relegate process skills to
pedagogic means rather than to educational ends, they hope ultimately
that "a critical appreciation of the way scientists work" will empower
pupils to demonstrate their skills, which they consider desirable.

Bryce and Robertson [32] argued at length to counteract the criticisms put
forward by Miller and Driver [31] against the process skills in the school
science laboratory. The TAPS materials developed by Bryce et al.
were not content-independent process skills. Pupils taught to follow the
check-list of these skills will certainly have an opportunity to make
inferences in a scientific context at the end of the work.

Although Miller and Driver [31] tried to make a great distinction between
"process" and knowledge, in reality the relationship between scientific
skills (science know-how) and scientific knowledge (know-that) is much
more subtle and intricate than Miller and Driver appeared to believe.

In "Science 5-16: A statement of policy" [29], the Department of Education
and Science tried to explain the need for science education for all
pupils. It suggests that knowledge of scientific methods will not only
help to develop intellectual talent, but also offer opportunities for
careful observation, measurement and planning in everyday life in the
future.
In his paper, Gott discussed the nature of some of the skills involved
in experimental work and how assessment of these skills can be carried
out. He has mentioned 4 major skills, viz. (i) observation, (ii)
measurement, (iii) explanation (i.e. interpretation), and (iv) open-ended
investigation and problem solving.

Apart from these, I think, pupils will also need to develop the skill of 
recording and presenting (in a table/graph) their observations and
measurements of various aspects of an experiment. 

For the assessment of observation and measurement, pupils can be
provided with a check-list of all possible options. When a pupil records
his/her observations or measurements, the record may or may not be the
truth, but it is what the pupil considers to be happening during the
experiment. These results can be assessed against some standard results,
providing a basis for the assessment by Method-P as mentioned earlier.

However, for on-the-spot assessment of the performance of the
skills/techniques during practical work (assessment by Method-T),
several experienced science teachers equipped with a check-list can be
used as judges (assessors).
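
The two bases of marking described above can be sketched as follows. This is
an illustration only: the check-list items, the standard value and the
tolerance are hypothetical assumptions, and the function names are not taken
from any marking scheme used in this study.

```python
def checklist_score(observed):
    """Method-T style mark: one point per check-list item performed correctly.

    observed maps each check-list item to True (performed correctly)
    or False, as ticked by a judge watching the pupil.
    """
    return sum(1 for done in observed.values() if done)


def product_score(reported, standard, tolerance):
    """Method-P style mark: 1 if the reported result lies within an
    assumed tolerance of the standard result, otherwise 0."""
    return 1 if abs(reported - standard) <= tolerance else 0


# Hypothetical check-list for measuring volume with a measuring cylinder.
observed = {
    "cylinder placed on a level bench": True,
    "eye level with the meniscus": False,
    "reading taken at the bottom of the meniscus": True,
}
technique_mark = checklist_score(observed)

# Hypothetical reported volume (cm3) checked against a standard of 25.0 cm3.
product_mark = product_score(24.5, 25.0, 1.0)
print(technique_mark, product_mark)
```

Keeping the two marks separate allows the achievement data from the two
methods to be compared pupil by pupil.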

The Bunsen burner is a simple but versatile apparatus in the chemistry
laboratory. A pupil usually starts using it as soon as he enters the 11+
science course and continues to do so throughout his school life.
Experiments involving heating by Bunsen burner are numerous in
school science practical courses. In a study, Farmer analysed the RNOC
syllabus for practical work and identified the "key" manipulative tasks
and cognitive tasks which were playing important roles, not only in
carrying out the laboratory work, but also in the complete
understanding of the whole chemistry course. Some of the "key"
manipulative skills frequently occurring in the experimental work are:

(1) use and control of a Bunsen burner to heat a solid/liquid, 

(2) transfer of liquid (or powdered solid) by pouring, 

(3) manipulation of cork/bung + accessories, 

(4) operation of a direct reading balance, and

(5) electrical connections within an electrolysis circuit, etc.

It would be worthwhile to subdivide each task into various steps of
manipulative skill and judge the pupils' achievement in each step to
isolate the area(s) of weakness for which special care can be provided
later.

Hubbard and Seddon investigated the performance of pupils (12-14
years old) in heating solids in a test-tube using a Bunsen burner.
Attempts were made to assess the abilities of pupils in carrying out the 3
manipulative skills involved. It was found that almost all the
pupils were good at placing the test-tube holder at the correct position.
However, a large percentage were unable to choose the correct colour of
the flame or to hold the tube at the correct angle. No significant
difference in achievement was observed between the sexes or
among the 3 year groups (2nd, 3rd and 4th year).



Seddon and Barry also used experiments involving the heating of
solids using the Bunsen burner as a part of their investigation. In this
study, pupils' ability was judged for carrying out 3 manipulative skills,
as mentioned in an earlier work by Hubbard and Seddon, in addition
to recording of a number of observations. However, the main objective
of this work was to compare the achievement of 4 groups of pupils
carrying out the same experiment by following 4 different modes of
instruction, viz. (1) work-sheet, (2) work-sheet + static diagram, (3)
work-sheet + static diagram + audio-tape, and (4) work-sheet + static
diagram + video-tape with pictures of demonstrations.

It was found that the pupils who followed only the work-sheet
produced the best results. These results have been explained on the
basis of the idea that a pupil working with only written instructions can
think independently and create his/her own mental imagery needed to
plan the practical work. On the other hand, when additional materials
were provided in the form of another mode of instruction, the pupil's
thought was influenced by outside ideas which may interfere
with the ability to think independently and impede planning and
progress. Furthermore, the video film, in which life-size moving
pictures of a demonstration were provided, may have given the
pupils a false sense of security, and discouraged them from thinking
independently to create a working condition on the bench.

In addition to heating solids, experiments involving the measurement
of volume by using a measuring cylinder were also used by Seddon and
Barry to confirm their findings regarding the most efficient mode of
instruction in a chemistry practical. Various manipulative skills,
thought to be needed to measure the volume of a liquid accurately, were
checked by the judges. However, the nature of some skills, such as whether
the pupil looks at the meniscus with his/her eye at the correct position or
whether he/she is holding the cylinder in a correct way, makes them very
difficult for a judge to check without interrupting the pupil's work and
within a limited period of time (lesson-time) when 2 groups of 20 pupils
have to be assessed.

It would be better to teach these necessary steps in an experimental 
session during lessons. Furthermore it could be preferable not to assess 
how a pupil performs the volume measurement, but to check at the end 
if the volume of the liquid in the cylinder is correct. 

Some reasons for the variation in pupils' ability to retain certain
facts taught earlier have been put forward by Ausubel. In his
report, he suggested that there are two types of learning: rote learning,
which is easy to forget, and meaningful learning, which is retained by
the pupil.

The learning and retention of meaningful materials are primarily
affected by their interaction with relevant subsuming concepts already
established in the cognitive structure. Materials learned by rote, on the
other hand, are discrete and isolated entities which have not been
related to any established concepts in the learner's cognitive structure.
Because they are not anchored to an existing ideational system,
they are much more vulnerable to being forgotten.
Ausubel's assimilation theory of meaningful learning and
retention has been criticised as "hopelessly vague" by Anderson et al.,
who proposed a new theory called "Schemata theory", which suggests
that knowledge is stored in various slots or "placeholders" in the learner's
mind and that these can be instantiated with particular cases.

However, in reply to this criticism, Ausubel wrote the following in a
subsequent article: "It is paradoxically apparent that their theory of
schemata, substantively almost identical with my more inclusive theory
of anchorage ideas in cognitive structure, which they first seem to
approve of, but now stigmatise as "hopelessly vague", and in their
actual research, their schemata, unlike mine, do not facilitate
meaningful learning".

As the success of a pupil in the assessment of his/her practical work
depends not only upon how much of the skill/knowledge he/she can retain
from the teaching session, but also upon how
efficiently he/she was taught by the teacher, it is perhaps worthwhile to
look at the findings of the CLIS project by Driver and Oldham. They
have suggested the development of a model for a constructivist
teaching sequence. Pupils are thought to engage in activities which
encourage them to construct scientific ideas for themselves. This
constructivist approach to curriculum development was suggested in
response to the findings of an APU project, which reported that
science teaching was didactic and pupils found it difficult to understand
certain scientific ideas.






The constructivist view of learning and teaching science, which
emphasises that the learner is ultimately responsible for his/her own
learning through a sincere effort, is also suggested in another article by
Driver and Bell [45].

The process of learning, i.e. the acquisition of the ability to enhance
human performance, knowledge structure and conception, can be
explained both in terms of behavioural psychology and cognitive
psychology, although the recent trend is to put more emphasis on the
latter than the former. Much research has been done and many papers have
been published in this field, which has been reviewed recently by
Shuell [46].

Although the manipulative skills involved in science practical work are 
psychomotor skills, learning of these skills successfully, retaining them 
and reproducing/ transferring them to a new situation do indeed require 
a cognitive ability on the part of the pupil.

The transfer of skills to a new situation depends not only on how
well-organised and broad-based the training session is, which leads to
meaningful learning, but also on whether the trainee can see
that the two skills have elements in common, as has been mentioned by
Annett and Sparrow [47]. Awareness of this link between two skills in
two situations is explained by "the identical elements theory".
However, another theory, called "the formal discipline theory", can also
be used to explain the transfer of skills. It asserts that two skills depend
on a common capacity or potential which can be developed by
practising.

In any study of the assessment of practical work, its reliability and
validity should be looked at before drawing a meaningful conclusion. In this
context, the work reported by Eglen and Kempa can be cited. This
study questions how concordant (reliable) teachers' assessments of
students' manipulative abilities are, and how such concordance is affected
by different assessment procedures.

The results show that "open-ended" assessment procedures, i.e. those
largely based on impression grading without reference to reasonably
specific assessment criteria, lead to considerable differences in the grades
awarded by teachers for the same practical performance.

When assessments are carried out with reference to criteria which relate 
directly to the performance to be assessed, less divergence in the
resulting grades is observed. However, a highly "objectified" assessment 
mode involving the use of a check-list does not result in full agreement 
of grades awarded. 

In reporting the findings of their study, Geniel and Hofstein
mentioned that assessment of practical work using a check-list of
well-defined criteria (objective mode of assessment) is much more precise
and reliable than if the assessment is carried out by observing the 
overall performance of pupils, and marking by teachers according to 
their own preference (subjective mode of assessment). 






Depending on the nature of the task involved and the apparatus used, 
assessment of practical work can be carried out either by following a
process check-list (on-the-spot test items) and judging on the spot the
performance of the manipulative skills involved, or by following a
product-based "end-check" test and marking the final outcome of
the extended task, i.e. the written reports of the experimental work,
containing detailed records of observation, calculation, drawing graphs,
making inferences, etc. However, the tendency among the teachers to
use the second method is more common as it is often difficult for a 
teacher to assess a set of 20-25 pupils during a double lesson of 60-70 
minutes by following an on-the-spot check list of a reasonable length. 
The practice of assessing by marking the written report could be 
considered safe and valid if research could show clearly that there is a 
strong correlation between the achievement data obtained from two 
methods of assessment. Even so, assessment of some manipulative
skills on the spot during practical sessions is still essential and valuable 
as science is a practically based subject, and the ability to handle 
apparatus has its own merit. Cognitive assessment which emphasises 
factual knowledge recall has been a more widespread activity among the 
assessors (through examination). Recent moves to "process" in the field 
of assessment show that assessment of practical skill and abilities, i.e. 
psychomotor skills, is more problematic. This is because the practically 
based assessment is more demanding logistically, while the cognitive 
skill assessment can be done simply on the basis of pencil and paper 
work. However, all the "processes" (or "steps") involved in the practical
work can be divided roughly into two categories: (i) those that can be
done on the spot during practical work and involve the use of
equipment, and (ii) those that can be done by intellectual ability or
acquired knowledge and involve the use of pencil and paper. The former
type of process skill can be called a manipulative or psychomotor skill
while the latter type can be called a "cognitive skill". In order to perform
psychomotor skill correctly, a pupil needs a certain degree of intellectual 
ability (cognitive skill) to remember what he has learnt earlier by going 
through a discovery approach and solving the current problems as he 
progresses through the experiments. Similarly, to be able to perform 
cognitive skills well, pupils should also do well in manipulative skill 
performance and learn from that to enhance their knowledge (cognition). 
So, on the whole, there is no clear cut boundary between these two types 
of skill, but some degree of overlapping exists. 

During their research work on the TAPS project, Bryce et al. 25 have taken
painstaking efforts in developing "process" tests which are content-
dependent, i.e. having some form of attachment or relevance to the
content of the science which the pupils are studying as a part of their
course. As a result of the presence of such content-dependency 31, 32 in
process tests, pupils will certainly learn much about the scientific content
of the topic through the learning of the skill/process involved.

Methods or steps involved in scientific work can be defined 31 as
"processes" (some could be manipulative, whereas others could be
cognitive) and successful completion of these steps will undoubtedly
lead to the knowledge (product) and also learning of the contents of
science. Scientific processes are not merely pedagogic
means/skills/steps/methods used in the field of education; they are
much more than that. These processes do have attachment to the
contents of science and do help pupils to acquire knowledge and
achieve understanding of the subject (i.e. deliver the product).

The term "process" has been viewed by some researchers in two
senses of activity: in the pedagogic sense of activity (doing science) and
in the scientific/psychological sense of activity (making inferences).
Scientific skills (know-how) and scientific knowledge (know-that) are
intricately inter-related and the distinction is subtle and elusive. They
are interdependent and both form parts of a mental model useful for
the learning process.

"Process" in scientific, psychological and pedagogical terms should not 
be viewed in a limited sense, i.e. that it is free from the contents of 
science and would not lead to any useful "product" enabling pupils to 
increase their knowledge and understanding of science.

Many researchers, for example Miller and Driver, consider "Process
skill" and "Process" to be the same thing, but Bryce and Robertson 32
make a distinction between these two terms, even though it is a very
fine one. By the word "Process", they refer to a skill as such in a
unidimensional sense, while "Process skills" seem to mean a sub-set of
skills in a particular skill area. For example, observing or hypothesising
or inferring can be described as "process", while "process skills" are
various objectives/steps/activities within the skill area.

However, in this thesis such fine distinction has not been considered in 
great detail. Simply, all the basic skills in school science practical work 
are called "Process" skills (they are processes of one kind or another and
involve some form of skill by the pupil): some involve correct
handling of the apparatus and are called manipulative skills, while
others involve mental thought processes (done by using pencil and paper)
and are called cognitive skills. Each skill may have a number of steps,
i.e. a set of skills, which are used as a check-list/marking scheme for on-
the-spot assessment during a practical session or when marking the
written report. Such a report contains aspects of the experimental work 
which involves cognitive skills and also the final outcome of the whole 
experiment (i.e. product). 

1.4 Purpose of the present study 

Seddon and Barry 00 used the "reaction of thiosulphate solution with HCl"
as an experiment for their investigation to find the best mode of
instruction out of the 4 they used. Each group of pupils following a
different mode of instruction was assessed by teachers following a check-
list of various manipulative skills. A careful examination of this check-
list will reveal that some of the listed skills are so simple that every
pupil can do them well. Some of them will hardly have any influence on
the final result, and some of them can be executed so quickly that
examiners will find it difficult to assess them without interrupting the
experiment.

However, having considered these points the check-list can be 
shortened and a modified one prepared. It has been thought that 
accurate measurement of the volume of certain liquids and pouring 
them out completely into the reaction vessel are important 
manipulative skills capable of influencing the final results. 
Consequently assessment of these two skills is used to see how effective 
the pupils are in their practical work. In that study the assessments carried
out by the judges were not checked for reliability, nor was the
outcome of the assessment checked for its validity.

At this point, another experiment, the rate of reaction of Mg with
dilute HCl, was chosen, in which the skills of measuring a volume with
a measuring cylinder and pouring the liquid out accurately are involved.
It has been thought that these 2 skills are likely to influence the final
results of the experiment, i.e. the time needed for the Mg ribbon to
dissolve in HCl. In the present study we have assessed the pupils by two
methods: (i) Method -T (spot check for both the manipulative skills)
and (ii) Method -P (checking the final results, i.e. time for Mg to
dissolve, against a standard value). We have also tried to correlate the
achievements of a pupil assessed in two different ways. The reliability of
assessment carried out by judges marking only one skill, as compared to
those marking 2 skills, was studied. The results of this section led to
the next part of this study.

Almost all the pupils performed the skill of pouring out liquid almost
perfectly. However, measuring volume was not performed as uniformly well,
and it was thought that there would be room for teaching this skill and
testing it after the lapse of a certain period to find out how well pupils
can reproduce it. In addition to the volume measurement by
measuring cylinder, certain key skills involved in using a Bunsen 
burner during the heating of solid samples were also taught and 
assessed after a similar time lapse. This was thought to be important 
because the use of a Bunsen burner to heat solid samples occurs as a part 
of many experiments. 

This work was a follow-up of the work carried out and reported by
Hubbard and Seddon 03, which shows that, in heating solids in a test-
tube using a Bunsen burner, pupils can perform some skills very well,
but some skills less so. This type of weakness in certain manipulative 
skills was thought to be responsible for the overall poor performance in 
using the Bunsen burner by certain pupils. 

In the present study some pupils have been given lessons on these skills 
and tested after a period in order to find out if they can reproduce these 
skills. Attempts have also been made to compare the performance of 
these pupils with that of the pupils who were not taught these skills.

Finally, it was thought of interest to extend the investigation into the 
pupils' ability, not only in retaining, but also in extending and applying 
a skill taught earlier into a new and slightly more difficult situation. 

For this purpose pupils were tested in measuring a certain volume of
liquid using a burette and in heating a liquid in a test-tube, having
been given lessons on measuring volume with a measuring cylinder and
heating solid samples in a test-tube some time earlier.

Both the methods of assessment used here are based on well-defined
check-lists and thus can be classified as "objective mode" of assessment,
which has been thought to be more reliable than "subjective
mode" of assessment (i.e. assessment based on impression grading by
the teachers).

1.5 Scope and limitation involved in designing the experiments for the
present study

In general the experimental materials should be educational, related
and relevant to the course which is being studied by the pupils taking 
part in the research project. In that way it will be beneficial to them and 
the time spent by them for our investigation will be worthwhile. Care 
must be taken so that the experiment is not a repetition of the work that 
has recently been done during lessons as a part of their class-work. 

In preparing a work-sheet, we must remember that each pupil should
not only find something to learn, but also find that it lacks repetition. 
To make the work stimulating and attractive, the apparent objective 
may be different from the interest of the researchers. 

So, we need to incorporate certain elements in the work which interest
us as good research material, and to ensure the whole work appears to 
be interesting, educational and relevant to the tasks pupils have done in 
the recent past and will do in the future. In other words, they will feel 
that some form of progress is being made step by step. Furthermore, in 
addition to the manipulative skills which we wish to study, there 
should also be sufficient materials involving cognitive skills which 
motivate more able pupils in doing the work. 

To achieve the desired objective, both for the pupils and the researchers, 
a large number of experiments can be used. However, one must bear in 
mind the limitation of laboratory facilities in a school. To keep the cost 
down, chemicals should not be expensive. Furthermore, there should
not be any major safety problems in handling the chemicals by the
pupils during the experiment. 

Regarding the selection of the apparatus, one should be careful to use 
those items which are available in the school laboratory and likely to be 
used by pupils during the period of their study in a school. 

The amount of apparatus available in a school laboratory may not be
adequate for this type of experiment, where each pupil will require an
individual set. However, this can be overcome by borrowing from the
University of East Anglia or a local school.

Laboratory space often influences the number of pupils that can be
accommodated in one session. One double-period may be
used for each session without causing too much disruption in the daily
routine of the pupils. The amount of work to be included in the work-
sheet is also carefully planned.



1.6 Statistical methods employed in data analysis 

Two important statistical techniques, namely analysis of variance and
test for correlation, were used to analyse the experimental results and 
understand their statistical significance. 

Analysis of variance (ANOVA) 

In designing an experiment, the researcher is concerned with the 
difference between two conditions of a variable. The difference between
the subjects' performance under two different conditions of a variable is 
noted and interpreted. In order to understand the statistical significance 
of this difference, statistical analysis is carried out. There are various 
methods for this purpose, but the most suitable method for an 
experiment depends on the number of variables, number of conditions 
of a variable, whether the subjects are related (matched) or unrelated
(different), and the scale (nominal/ordinal/interval etc.) used for data
collection for the
subjects under different conditions of the variable. 

The experiments described in the thesis have two variables, each
variable having two or three conditions. The subjects were different under each
condition and they were obtained by randomising the set of pupils
(same age range, similar academic ability and mixed gender). The data
(achievement of pupil) were collected on an interval scale. Having
considered these aspects, it became clear that a 2-way randomised
ANOVA would be appropriate for the statistical analysis of the data
collected in each of the three experimental designs given in Chapters 2, 3
and 4.

Although the frequency distribution of the data did not have the completely
symmetrical shape of a normal distribution in each case, a 2-way ANOVA
(a parametric test) could be applied to study the statistical significance of
the results, i.e. the difference between two conditions of a variable. This is
because the subjects' scores were gathered numerically on an interval
scale and this method gives the opportunity to study the interaction
between two variables. Although homogeneity of variance, i.e.
approximately the same variability of scores for each
experimental condition, is desirable for a parametric test, it is not essential as
long as there is an equal number of subjects in each experimental
condition, and this holds good for each of the experiments described in
this thesis.

Furthermore, as parametric tests are reasonably robust as far as the above-
mentioned conditions (criteria) are concerned, it is unlikely that a
seriously impaired answer will be obtained for the percentage probability
of obtaining the variance ratios. The letter "F" is used to denote the
statistic (variance ratio) obtained after completing the calculation for an
ANOVA and it can be defined as

$$F = \frac{\text{variability between scores due to independent variable}}{\text{variability between scores due to other variables}}$$

$$F = \frac{\text{variance due to source of variance}}{\text{variance due to error}}$$

According to experimental hypothesis, the variability in score is 
produced due to a difference in the condition of the independent 
variable. In order to avoid contributions from other sources, such as 
intelligence, eagerness to learn, initial knowledge about the subject, 
gender difference etc., the subjects of the experiment were randomised 
in groups for studying different experimental conditions. However, one 
still can claim, giving rise to the Null hypothesis, that the variability is 
due to other unknown variables than the independent variable 
manipulated by the researcher. 

Analysis of variance gives us the opportunity to obtain a ratio (F)
between the variance due to the source of variance and the variance due
to other variables (also known as the error factor). If the "F" ratio is high,
one can suggest that the variability in score is mainly due to the
independent variable and the contribution from other sources is
negligible.

Using statistical methods outlined in "The Numbers Game - Statistics
for Psychology" 50, one can calculate F_A and F_B values for the
independent variables (A and B) and also the interaction of the variables,
F_AxB (in case of a 2-way ANOVA).
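
As an illustration of how such variance ratios are obtained, the following is a minimal sketch in Python of a balanced 2-way randomised ANOVA. It is not the procedure or software used in this thesis: the function name two_way_anova, the layout of the argument cells and the example scores are assumptions made only for the illustration, and equal numbers of subjects in each condition are assumed.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way between-subjects ANOVA (illustrative sketch).

    cells[i][j] is the list of scores for condition i of variable A and
    condition j of variable B. Every cell must hold the same number of scores.
    Returns the variance ratios and their degrees of freedom.
    """
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])
    if any(len(cells[i][j]) != n for i in range(a) for j in range(b)):
        raise ValueError("equal numbers of subjects per cell are assumed")
    if n < 2:
        raise ValueError("at least two subjects per cell are needed")

    grand = mean(x for row in cells for cell in row for x in cell)
    cell_mean = [[mean(cell) for cell in row] for row in cells]
    # Marginal means can be taken over cell means because the design is balanced.
    a_mean = [mean(cell_mean[i]) for i in range(a)]
    b_mean = [mean(cell_mean[i][j] for i in range(a)) for j in range(b)]

    # Sums of squares for each source of variance.
    ss_a = n * b * sum((m - grand) ** 2 for m in a_mean)
    ss_b = n * a * sum((m - grand) ** 2 for m in b_mean)
    ss_cells = n * sum(
        (cell_mean[i][j] - grand) ** 2 for i in range(a) for j in range(b)
    )
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a)
        for j in range(b)
        for x in cells[i][j]
    )

    df_a, df_b = a - 1, b - 1
    df_ab = df_a * df_b
    df_error = a * b * (n - 1)
    ms_error = ss_error / df_error
    if ms_error == 0:
        raise ValueError("no within-cell variability; F is undefined")

    return {
        "F_A": (ss_a / df_a) / ms_error,
        "F_B": (ss_b / df_b) / ms_error,
        "F_AxB": (ss_ab / df_ab) / ms_error,
        "df_A": df_a,
        "df_B": df_b,
        "df_AxB": df_ab,
        "df_error": df_error,
    }


# Hypothetical scores: 2 conditions of A x 2 conditions of B, 2 pupils per cell.
example = [[[1, 3], [2, 4]], [[5, 7], [8, 10]]]

if __name__ == "__main__":
    print(two_way_anova(example))
```

With the hypothetical scores above, the sketch gives F_A = 25, F_B = 4 and F_AxB = 1, each on 1 and 4 degrees of freedom. Each ratio would then be compared with the tabulated critical value of F for those degrees of freedom.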



The significance of the F value can then be checked by looking at the critical
value of F at the corresponding df (degrees of freedom) given in the
data table for critical values of F. If the calculated F value is higher than
the critical value for p = (say) 0.01 or 0.05, then it can be considered as
significant at the p<0.01 or p<0.05 level respectively. This means that the
probability of obtaining such variability in the score through error factors
alone is less than 1% or 5% respectively, and thus the "Null hypothesis" is
rejected at this level and the experimental hypothesis is accepted.

A number of 2-way randomised ANOVAs have been used to analyse 
the experimental results in Chapters 2, 3 and 4, as the data are thought to
satisfy the necessary conditions described earlier.

Tests for correlation

A correlation design looks at a possible link/relationship between two sets
of data (or scores) achieved by a group of subjects under two different
experimental conditions (or two different tasks). Here the main
objective is to see whether the subjects' performance under two
conditions is correlated, either positively or negatively. As an example,
one can cite the marks in two subjects, e.g. Physics and
Mathematics, obtained by a set of pupils. If there is a positive
correlation between the two sets of marks, pupils having the highest
mark in physics will tend to have the highest mark in mathematics, while the
pupils obtaining the lowest mark in physics will tend to have the lowest mark
in mathematics. However, if the correlation is negative, pupils having
a high mark in physics will tend to have a low mark in mathematics.

In order to see if there is a correlation, the mathematics score of a pupil 
is plotted against his/her physics score. This type of plot is known as a 
scattergram. It may or may not be possible to visualise easily a clear
relationship between two sets of data. Thus a statistical technique can be
employed to calculate the correlation value and find if it is statistically
significant enough to reject the "Null Hypothesis" (i.e. the apparent
relation is due to chance alone) and accept that there is a significant
correlation. 

It is possible to use either a non-parametric test (Spearman rank 
correlation coefficient) or a parametric test (Pearson product moment 
correlation coefficient) to test the significance of a correlation between
two variables. As the data of the experiments reported in this thesis are
on an interval scale and generally meet other criteria for a parametric
test, Pearson -r values were calculated. The correlation "r" value is
expressed as a number ranging from -1 (perfect negative
correlation), through 0 (no correlation), to +1 (perfect positive
correlation).
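
The calculation of the Pearson product moment correlation coefficient can be sketched as follows. This is a minimal illustration using the standard definition of r, not the worked procedure from the text cited in this thesis. The function name pearson_r and the example marks are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("two lists of equal length (at least 2 scores) are needed")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one set of scores is constant")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical marks for four pupils in two subjects.
physics = [1, 2, 3, 4]
mathematics = [1, 3, 2, 4]

if __name__ == "__main__":
    print(pearson_r(physics, mathematics))
```

For the hypothetical marks above the sketch gives r = 0.8, a positive but not perfect correlation.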






After calculating the Pearson -r value (for the method of calculation, see
"The Numbers Game - Statistics for Psychology"), one needs to look at
a statistical table of critical values of "r" to discover the probability of this
amount of correlation arising by chance and thus decide if this "r"
value indicates a significant correlation between the two variables.
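
The tabulated critical values of "r" correspond to a t-test on n - 2 degrees of freedom. As a sketch of this relation (a standard result, stated here as an aid and not quoted from the text cited above; the symbol n, the number of pupils contributing a pair of scores, is introduced only for this illustration), under the Null Hypothesis of no correlation:

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad df = n - 2$$

For example, r = 0.8 obtained from n = 4 pairs of scores gives t = 0.8 x 1.414 / 0.6, i.e. about 1.89, which is below the two-tailed critical value of t for df = 2 at p = 0.05 (4.30), so such a correlation would not be considered significant with so few pupils.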

CHAPTER 2



ASSESSMENT OF PRACTICAL SKILLS 

2.1 Aim

The aim of the work presented in this section of the thesis is to study
the reliability and validity of assessment.

Assessments of a manipulative skill of a pupil carried out by a number of
judges using a check-list are ideally expected to be the same. However,
one may ask whether, in reality, that is so. The nature of
agreement/disagreement among a number of judges is worth
considering in order to answer such questions and to understand the
reliability of assessment.

During the assessment of a set of pupils by a teacher, as is usually the 
case in real life, more than one skill may have to be assessed. One may
ask if the assessment would be less reliable when a judge concentrates on
assessing more than one skill, instead of assessing only one skill. Here
attempts have been made to find an answer to this question.

During a practical session, pupils carry out manipulative skills in order
to collect data/record observations/deduce inferences which are
submitted as written reports. As the task of assessing the skills by on-
the-spot observation during practical sessions (i.e. Method -T) is time-
consuming and thus difficult for a teacher, compared with the
assessment carried out by checking the written report (i.e. Method -P)
which can be done at home, it may be worthwhile to see if there is a
significant correlation between the standard of achievement of a pupil
assessed in both ways. So a comparison between these two methods of
assessment is also undertaken. This will help us to find out if the result
of assessment of a pupil by Method -P can be used as a measure of their
performance in executing the manipulative skills during a practical
session.

2.2 Development of the study materials - Preliminary runs 

(i) Method

Attempts were made to design an experiment which would allow us to (a)
assess on the spot, by observation, the performance of pupils in
carrying out the skill of measuring a certain volume of liquid using a
measuring cylinder, (b) assess by marking the final results of the whole
experiment, and (c) study the reliability and validity of assessment of
the skill.

After some thought, it was decided that the experiment
"reaction of Mg with dil. HCl" would be an
experiment suitable for this purpose and would satisfy the scope and
limitations described earlier.

Prior to the final run, a trial run was carried out by using about 25 pupils
of an age range of 13-14 years from Norwich School, The Close, 
Norwich. The objective of this trial run was to find out the following: 

(a) Is the work-sheet self-explanatory and easy for all the pupils to follow?

(b) Does the method work or is there any need for further modification?

(c) How many runs (number of experiments) can be carried out within a
definite time limit (say one hour)?

(d) Are the apparatus and chemicals adequate for the work?

(e) How many pupils can be assessed by each judge for the
measurement of volume and pouring out the liquid?

This trial run helped me to eliminate any deficiency in the preparation of
the instructions in the work-sheet and also the arrangement of apparatus
and chemicals.

(ii) Discussion of the outcome of the trial runs

The length of the Mg ribbon was a problem. As it was 5 cm in length, 
when it was dropped into the flask containing dil. HCl, occasionally it
fell horizontally and sank into the acid. However most of the time it 
dropped vertically and a part of it remained above the acid. So it was 
decided to cut the Mg ribbon into 2 parts, each of 2.5 cm length and the 
pupils were asked to use 2 pieces of Mg ribbon for each experiment. In 
this way the ribbon dropped horizontally into the flask and sank into 
the acid each time. No other modification was necessary. 



2.3 Final run 
(i) Method

The task for the pupils was to find out the time taken by 2 pieces (2.5
cm each piece) of Mg ribbon to dissolve completely in a mixture of dil.
HCl and water (see work-sheet 2.1 in Appendix 1.1).

Four sets of 4th form pupils (14-15 years old) from a local comprehensive
school (Earlham School) took part in this experiment. About 20 pupils 
(boys and girls) were in each set and they were randomly allocated a 
work-station before the work-session. 

One large teaching laboratory in the School of Chemical Sciences at the 
University of East Anglia was used for this purpose. Each work-station 
was numbered so that each pupil, randomised at the door and provided 
with numbered labels, could come quickly to the matching numbered 
desk.



In order to facilitate the on-the-spot assessment by the judges who each 
carried a check-list and identified the pupils only by their numbers, the 
following arrangements were made. The pupils carried identifying
number labels on both arms, and the number plates at the work-
stations were raised to a reasonable height so that the judges could see them
clearly from a reasonable distance.

Each session lasted for about an hour. In between two sessions, there 
was an interval of about 15-30 minutes during which the used glass-
ware was removed and new chemicals and apparatus were laid out. A
list of chemicals and apparatus used in this experiment is given in 
Appendix 1.2. 

Each pupil was asked to carry out 6 runs (experiments) using 6 acid- 
water mixtures of different volume ratios. Work-sheets and other 
required materials were supplied at the work-desk before the start of the 
experiment. 

Nine qualified science-teachers from local schools participated as judges. 
They were divided into 3 panels and were assigned for carrying out the 
on-the-spot assessment by observation of the performance of the skill in 
the following way: 



Session            Panel A    Panel B    Panel C

0 (trial run)      M + P      M + P      M + P
1                  M          P          M + P
2                  M + P      M          P
3                  P          M + P      M

M and P stand for measuring volume by a cylinder and pouring the
liquid out of the cylinder respectively.

The experiment carried out by the first set of pupils was used by the
judges as a trial run to gain a clear idea of the nature of their task in this
experiment. This was helpful for their assessment work in the three
subsequent sessions.

Before the start of the experiment, the judges were briefed about the 
nature of the task and clear instructions were given about the job in 
each session, as has been outlined in the table above. 

It was thought that, among all the skills involved in this experiment,
the two most important skills to affect the final results, i.e. the time for
Mg to dissolve, are: (1) measuring the volume of acid/acid + water
mixture, and (2) pouring out the liquid into the flask. Each judge
assessed each pupil once for measurement of volume (M) or pouring
out the liquid (P) or both (M) and (P), according to the work-scheme
given earlier. The judges were instructed to give "1" mark for correctly
executing the skill and "0" for failing to do so.

The judges checked the accuracy of the volume measurement by taking
a close look at the measuring cylinder and seeing if the lower meniscus
was within ± 0.5 cm³ of the volume to be measured. The accuracy of the
pouring skill was checked by looking at each pupil's effort to drain the
last drop of liquid out of the cylinder.

In order to assess on the basis of the outcome of the extended task, the 
results of the experiment submitted by the pupils in the form of a 
written report (i.e. the time taken for Mg to dissolve reported by the 
pupil) were checked against the values determined by me. For this
purpose each of the 6 experiments (see work-sheet 2.1 in Appendix
1.1) was repeated 8 times by me. The volume of acid was measured
accurately, the pouring out operation was carried out with great care and
finally the exact time taken by the Mg ribbon to dissolve completely was
recorded accurately. The mean value of time for each experiment (with
the standard deviation as the accepted error limit) was used as the
standard result to carry out this assessment. A table containing these
results and s.d. values is given in Appendix 1.3. A probable
explanation for the variation of the s.d. values with the variation of
acid:water composition is also given there.
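
The checking of a reported time against the standard result can be sketched as follows. This is only an illustration under stated assumptions: the function names standard_result and method_p_mark are hypothetical, the sample standard deviation is assumed (the text does not say which form of s.d. was used), the award of 1 or 0 marks is an assumed marking convention, and the timings are invented and are not the values in Appendix 1.3.

```python
from statistics import mean, stdev


def standard_result(repeat_times):
    """Mean and sample standard deviation of the repeated reference timings."""
    return mean(repeat_times), stdev(repeat_times)


def method_p_mark(reported_time, repeat_times):
    """Return 1 if the reported time lies within mean +/- s.d., otherwise 0."""
    centre, limit = standard_result(repeat_times)
    return 1 if abs(reported_time - centre) <= limit else 0


# Hypothetical timings (in seconds) from 8 repeats of one acid-water mixture.
reference = [100, 102, 98, 101, 99, 100, 103, 97]

if __name__ == "__main__":
    print(standard_result(reference))
    print(method_p_mark(101, reference), method_p_mark(105, reference))
```

With these invented timings the standard result is 100 s with an s.d. of 2 s, so a reported time of 101 s would be accepted and one of 105 s would not.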

(ii) Results and Discussion

As the first assessment session during the experiment was used as a trial 
run and referred to as session -0 in this section, results of only the 2nd, 
3rd and 4th sessions of the experiment will be used in the calculation 
here and they will be referred to hereafter as Session -1, Session -2 and 
Session -3 respectively. 

According to the marking scheme, each pupil was marked in the
following way: a panel of 3 judges marked only the volume 
measurement skill, another panel of 3 judges marked only the pouring 
out skill and the 3rd panel of 3 judges marked both the skills. As each 
judge was instructed to give 1 mark for correct execution of a skill, each 
pupil could get a maximum of 6 marks for volume measurement, and 6 
marks for pouring out the liquid. 



The data in the judges' assessment sheets for all the 3 sessions (see
Appendix 1.4) have been analysed and the information available in these
tables was used to study the reliability and validity of assessment.



In order to look at the reliability of assessment, the following two 
aspects have been considered: 

(i) A comparison has been made between the achievement of the pupils
for their volume measurement skill assessed by a panel (say "A") of 3
judges for the volume measurement alone, and that assessed by another
panel (say "C") of 3 judges for both the volume measurement and
pouring out the liquid. A similar comparison can be made between the
achievement of the pupils for their liquid pouring skill assessed by a
panel (say "B") of 3 judges for pouring out the liquid alone, and that
assessed by the panel "C" for both volume measurement and pouring
out the liquid, as mentioned earlier.

(ii) The possible ways in which a panel of 3 judges can mark a pupil
when assessing a skill are shown in the table below:

Possibilities of marks       Overall achievement    Agreement among the
for a pupil from 3 judges    of the pupil           judges (Yes: 1, No: 0)

1    1    1                  3 marks                1
1    1    0                  2 marks                0
1    0    0                  1 mark                 0
0    0    0                  0 mark                 1

When a pupil gets 3 marks (1 + 1 + 1) or "0" mark (0 + + 0), the 3 
judges in a panel are in perfect agreement, otherwise the judges are not 
in perfect agreement. A situation of perfect agreement among the 
judges is represented by "1", while the lack of perfect agreement is 
indicated by "0". 
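The agreement rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the function name `panel_summary` is hypothetical and not part of the original study.

```python
def panel_summary(marks):
    """Return (overall achievement, agreement) for one pupil and one panel.

    `marks` holds the three judges' marks, each 0 or 1.
    Agreement is 1 only when all three judges gave the same mark,
    i.e. when the total is 0 or 3; otherwise it is 0.
    """
    if len(marks) != 3 or any(m not in (0, 1) for m in marks):
        raise ValueError("expected three marks, each 0 or 1")
    total = sum(marks)
    agreement = 1 if total in (0, 3) else 0
    return total, agreement


# The four rows of the table of possibilities.
for marks in [(1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0)]:
    print(marks, panel_summary(marks))
```

Because only the total matters, the order in which the judges are listed does not affect the result.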



Tables 2.1, 2.2, 2.3 and 2.4 contain summaries of the analysis of the data
related to the reliability of the assessment studies. Tables 2.1 and 2.2,
derived from further analysis of data in Tables 2.6, 2.7 and 2.8, show the
mean values of achievement when all the pupils in a session have been
assessed for one skill (volume measurement or pouring out liquid) by
a panel of judges, compared with the situation when both the skills
have been assessed by another panel of judges.

Tables 2.3 and 2.4, derived from further analysis of data in Tables 2.9,
2.10 and 2.11, show the mean values of total agreement among the judges
in a panel when all the pupils in a session have been assessed for one
skill (measuring volume or pouring out liquid) by a panel of judges,
compared with the situation when both the skills have been assessed
by another panel of judges.

It appears that the mean value of the achievement for any set (or, for all 
the 3 sets together) for volume measurement does not differ 
significantly whether a panel of judges assessed it alone or another 
panel of judges assessed both skills of measuring volume and pouring a 
liquid. 

A similar conclusion can be drawn for the mean value of the
achievement for the pouring out skill, whether a panel of judges assessed
that skill alone or another panel of judges assessed it together with the
volume measurement skill.

The mean value of performance for the pouring out skill is much
higher than that for the volume measurement skill. The results suggest
that the pupils in every set have been able to carry out the "pouring out
the liquid" skill very effectively, almost perfectly. However, the poor
performance (about 33% of the total marks) for volume measurement
suggests that the pupils would require further lessons in this skill. The
final results of the experiments, which have been used as the basis for
assessment by Method-P, are directly influenced by the pupils' ability to
carry out the volume measurement skill efficiently.

With regard to the agreement/disagreement among the judges, it can be
said from Tables 2.3 and 2.4 that the judges are in total agreement for
about 50% of the sample during the assessment of the volume
measurement skill, irrespective of whether it was assessed alone or
assessed in conjunction with the pouring out skill. The amount of total
agreement among the judges for assessing the pouring out skill was about
100%. As with the volume measurement skill, the pouring out skill
showed no significant difference in the amount of total agreement
depending on whether it was assessed alone or assessed with the
volume measurement skill.

For further analysis of the data of total agreement among the judges,
several 2 X 3 ANOVA based on randomised groups were carried out.
They were as follows:

(i) 2 ways of assessing volume measurement skill (A) X 3 panels of
judges (B)

(ii) 2 ways of assessing pouring out skill (A) X 3 panels of judges (B)

(iii) 2 ways of assessing volume measurement skill (A) X 3 sessions
(sets of pupils) (B)

(iv) 2 ways of assessing pouring out skill (A) X 3 sessions (sets of
pupils) (B)

Information derived from the ANOVA is in Table 2.5. 

None of the F(A) values is significant even at the p ≤ 0.05 level. This
suggests that whichever way a skill, i.e. measurement of volume or
pouring out liquid, is assessed, the amount of total agreement among the
judges is unaffected.

This finding lends support to the earlier conclusion regarding the 
reliability of assessment based on the fact that the number of total 
agreements does not differ significantly between the two ways of assessing 
volume measurement or pouring out skill. 

Furthermore, as none of the F(B) or F(A X B) values is significant even at
the p ≤ 0.05 level, carrying out the assessment with three panels of
judges and in three sessions also does not have any significant effect on
the number of agreements among the judges, and hence on the reliability
of the assessment.
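As a rough check on how the F ratios in Table 2.5 arise, the sketch below recomputes one of them from the sums of squares and degrees of freedom quoted in Table 2.5(a). The function name `f_ratio` is hypothetical; the numbers (SS = 0.209 with df = 1 for factor A, SS = 28.450 with df = 114 for error, critical F = 4.00) are taken from that table.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio = mean square of the effect / mean square of the error."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Factor A (2 ways of assessing the volume measurement skill), Table 2.5(a).
f_a = f_ratio(ss_effect=0.209, df_effect=1, ss_error=28.450, df_error=114)
f_crit = 4.00  # critical value quoted in Table 2.5 at p = 0.05

print(round(f_a, 3))   # about 0.837, as tabulated
print(f_a < f_crit)    # True: the observed F is not significant
```

Small differences from the tabulated values for other rows can arise because the table reports mean squares rounded to three decimal places.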

The achievement, based on the assessment by Method-T, in performing
each of the practical skills for all the pupils in each of the 3 sets working
in 3 sessions has been presented in Tables 2.6, 2.7 and 2.8. Appendix 1.4
contains the raw data, i.e. the mark-sheets of the 3 panels (A, B, C) of
judges for all the 3 assessment sessions. The data in each session were
analysed in the following way. Marks obtained by each pupil for his/her
volume measurement/liquid pouring out skill were added together and
presented as the achievement data based on the assessment Method-T in
Tables 2.6, 2.7 and 2.8. As each pupil was assessed by 3 judges for volume
measurement alone, 3 judges for pouring out alone and 3 judges for both
skills, the maximum mark that a pupil can get for correct execution of
each skill is 6. In each table, there is also a column of data showing the
achievement of these pupils based on the assessment by Method-P, which
involves the judgement of the accuracy of the final results of the
experiment (i.e. the time taken by the Mg ribbon to dissolve completely).
A maximum of 6 marks can be obtained by a pupil for the correct results
of 6 runs in experiment 2.1.
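The way a Method-T total is built up for one skill can be sketched as follows. This is an illustrative sketch only, assuming each judge awards 0 or 1; the function name `method_t_total` and the example marks are hypothetical.

```python
def method_t_total(alone_marks, both_marks):
    """Method-T achievement for one skill of one pupil (maximum 6).

    `alone_marks`: marks from the 3 judges who assessed this skill alone.
    `both_marks`:  marks for this skill from the 3 judges who assessed
                   both skills.
    """
    marks = list(alone_marks) + list(both_marks)
    if len(marks) != 6 or any(m not in (0, 1) for m in marks):
        raise ValueError("expected two panels of three marks, each 0 or 1")
    return sum(marks)


# Hypothetical pupil: full marks from one panel, one mark from the other.
print(method_t_total((1, 1, 1), (1, 0, 0)))
```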

Using the judges' mark-sheets (see Appendix 1.4) and the pupils'
achievement data in Tables 2.6, 2.7 and 2.8, situations of
agreement/disagreement among the judges in assessing the pupils in all
3 sessions have been analysed and are listed in Tables 2.9, 2.10 and 2.11.



The marks awarded by the judges (assessment by Method-T) for volume
measurement and pouring out a liquid were plotted against the marks
based on the assessment by Method-P, for all the pupils in the 3 sessions,
in Figures 2.1 and 2.2 respectively. Such a comparison can be used as a
measure of validity of assessment. The data plotted in each figure appear
to be widely scattered and it is quite difficult to distinguish any clear
relationship (trend) between the data gathered from the two methods of
assessment. For this reason Pearson-r (i.e. Pearson product moment
correlation coefficient) values were calculated and the results are shown
in Table 2.12. The r values appear to lie within ± 0.5, suggesting the lack
of a very strong (near to perfect) correlation. However, it is worthwhile to
look at the statistical significance of the existing correlation as shown by
the r value.
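For readers who wish to reproduce the coefficient, a minimal sketch of the Pearson product moment calculation is given below. It uses only the standard definition; the function name `pearson_r` and the sample marks are hypothetical and are not the data of Tables 2.6 to 2.8.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation coefficient of two mark lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # r is undefined if every pupil has the same mark on one method.
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical Method-T and Method-P marks for five pupils.
method_t = [1, 2, 2, 4, 5]
method_p = [3, 4, 2, 5, 6]
print(round(pearson_r(method_t, method_p), 3))
```

The guard for a constant variable matters in practice: when every pupil in a group scores the same mark on one method, the coefficient cannot be computed for that group alone.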

Here the experimental hypothesis is one-tailed and a positive correlation
is expected. It is thought that a pupil's achievement for
performing the volume measurement skill/pouring out skill assessed by
Method-T will be associated with the achievement of the
pupil assessed by Method-P (i.e. marking the written report on the
outcome of the extended task of the whole experiment). According to the
"Null Hypothesis", the correlation "r" value obtained from calculation
could be due to chance. So the calculated "r" value is compared
with the critical value of "r" at the corresponding degrees of freedom,
df = 58 (df = N - 2, where N = no. of subjects in the experiment).

One of the "r" values calculated for the volume measurement skill, r =
0.306 (for the 6th run), appears to be higher than the critical value at p =
0.025 but lower than the critical value at p = 0.01. This suggests that the
calculated "r" value is significant at the p ≤ 0.025 level, i.e. the
probability of the correlation arising by chance is equal to or less than
2.5%, and thus the "Null Hypothesis" is rejected and the experimental
hypothesis is accepted at the p ≤ 0.025 level. However, the rest of the "r"
values calculated for the volume measurement skill and all the "r" values
for the pouring out skill are lower than the lowest critical value of r
(= 0.2306) for df = 58, which is at p = 0.05. This suggests that the
probability of the correlation occurring by chance is higher than 5% and
thus these "r" values are not significant statistically.
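The comparison described above can be sketched by checking each r value of Table 2.12 against the critical value of 0.2306 quoted in the text for p = 0.05. The dictionary and function names are hypothetical; the r values are those listed in Table 2.12 for the six individual runs.

```python
CRITICAL_R_P05 = 0.2306  # critical r quoted in the text (df = 58, p = 0.05)

# Pearson-r for each run of experiment 2.1, as listed in Table 2.12.
volume_r = {"1st": 0.159, "2nd": -0.160, "3rd": 0.223,
            "4th": -0.023, "5th": 0.100, "6th": 0.306}
pouring_r = {"1st": 0.078, "2nd": -0.011, "3rd": 0.111,
             "4th": -0.044, "5th": -0.183, "6th": 0.021}


def significant_runs(r_values, critical=CRITICAL_R_P05):
    """Runs whose r exceeds the critical value (one-tailed, positive r)."""
    return [run for run, r in r_values.items() if r > critical]


print(significant_runs(volume_r))   # ['6th']
print(significant_runs(pouring_r))  # []
```

Because the hypothesis is one-tailed and a positive correlation is expected, negative r values are never counted as significant here, whatever their size.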

A significant correlation, even though only at the p ≤ 0.025 level, shown
(by the result of one of the runs of the volume measurement experiment)
between the achievement data of the pupils when assessed by Method-T
and that when assessed by Method-P, suggests that the accuracy of the
volume measurements does indeed have a strong influence on the final
result, i.e. the time needed for the Mg ribbon to dissolve completely, and
that assessment of the final result submitted as a written report can be
used as a substitute for the assessment of the volume measurement skill
on the spot.

Although the "pouring out liquid" skill is an important technique in the
whole experiment and should have a strong influence on the final result,
the lack of a strong correlation between the achievement data from
Method-T and that from Method-P is likely to be due to the fact that
performance of both the volume measurement skill and the pouring out
skill affects the accuracy of the final result and hence the pupils'
achievement in assessment by Method-P. From the discussion of the
experimental results given earlier, it has become clear that pupils
perform the pouring out skill almost perfectly, while their performance in
the volume measurement skill is not very good (they obtained only 33%
of the total marks). So, the pupils' low achievement data obtained from
the assessment Method-P are perhaps the direct consequence of their
poor performance in volume measurement, while the contribution of the
pouring out skill to the final result can be considered as constant (all the
pupils performed it very well).

Table 2.1

Achievement of pupils in carrying out 2 manipulative skills involved in
experiment 2.1

(The data analyses were based on the fact that the 3 sets of pupils were
assessed in 3 sessions)

Session              Skill of measuring        Skill of pouring out
                     volume of liquid          liquid
                     M         _M_ + P         P         M + _P_

1            x       1.05      1.30            2.85      2.80
             s.d.    ±0.86     ±0.90           ±0.36     ±0.40

2            x       1.00      0.95            3.00      2.85
             s.d.    ±1.09     ±0.92           ±0.00     ±0.36

3            x       0.80      0.90            2.75      2.75
             s.d.    ±0.93     ±0.94           ±0.43     ±0.43

1 + 2 + 3    x       0.95      1.05            2.87      2.80
             s.d.    ±0.97     ±0.94           ±0.34     ±0.40

x = mean value of achievement per pupil (maximum value is 3)

s.d. = standard deviation in "x"

M/P = only volume measurement/pouring out skill was assessed by a panel of
judges in a session

_M_ + P/M + _P_ = Both the skills were assessed by a panel in a session
(only the underlined skill assessment was used in calculation)

Table 2.2

Achievement of pupils in carrying out 2 manipulative skills involved in
experiment 2.1

(The data analyses were based on the fact that 3 panels of judges carried
out the assessment)

Panel of judges      Skill of measuring        Skill of pouring out
                     volume of liquid          liquid
                     M         _M_ + P         P         M + _P_

A            x       1.05      0.95            2.75      2.85
             s.d.    ±0.86     ±0.92           ±0.43     ±0.36

B            x       1.00      0.90            2.85      2.75
             s.d.    ±1.09     ±0.94           ±0.36     ±0.43

C            x       0.80      1.30            3.00      2.80
             s.d.    ±0.93     ±0.90           ±0.0      ±0.40

A + B + C    x       0.95      1.05            2.87      2.80
             s.d.    ±0.97     ±0.94           ±0.34     ±0.40

x = mean value of achievement per pupil (maximum value is 3)

s.d. = standard deviation in "x"

M/P = only volume measurement/pouring out skill was assessed by a panel of
judges in a session

_M_ + P/M + _P_ = Both the skills were assessed by a panel of judges in a session
(only the underlined skill assessment was used in calculation)

Table 2.3

Amount of total agreement among the 3 judges in a panel in assessing the
pupils' skill in experiment 2.1

(The data analyses were based on the fact that the 3 sets of pupils were
assessed in 3 sessions)

Session              Skill of measuring        Skill of pouring out
                     volume of liquid          liquid
                     M         _M_ + P         P         M + _P_

1            x       0.35      0.30            0.85      0.85
             s.d.    ±0.48     ±0.46           ±0.36     ±0.40

2            x       0.60      0.45            1.00      0.85
             s.d.    ±0.49     ±0.50           ±0.00     ±0.36

3            x       0.50      0.50            0.75      0.75
             s.d.    ±0.50     ±0.50           ±0.43     ±0.43

1 + 2 + 3    x       0.50      0.42            0.87      0.80
             s.d.    ±0.50     ±0.49           ±0.34     ±0.40

x = mean value of agreement (maximum value is 1)

s.d. = standard deviation in "x"

M/P = only volume measurement/pouring out skill was assessed by a panel of
judges in a session

_M_ + P/M + _P_ = Both skills were assessed by a panel of judges in a session

(only the underlined skill assessment was used in calculation)

Table 2.4

Amount of total agreement among the 3 judges in a panel in assessing the
pupils' skill in experiment 2.1

(The data analyses were based on the fact that 3 panels of judges carried
out the assessment)

Panel of judges      Skill of measuring        Skill of pouring out
                     volume of liquid          liquid
                     M         _M_ + P         P         M + _P_

A            x       0.35      0.45            0.75      0.85
             s.d.    ±0.48     ±0.50           ±0.43     ±0.36

B            x       0.60      0.50            0.85      0.75
             s.d.    ±0.49     ±0.50           ±0.36     ±0.43

C            x       0.55      0.30            1.00      0.85
             s.d.    ±0.50     ±0.46           ±0.36     ±0.40

A + B + C    x       0.50      0.42            0.87      0.80
             s.d.    ±0.50     ±0.49           ±0.34     ±0.40

x = mean value of agreement (maximum value is 1)

s.d. = standard deviation in "x"

M/P = only volume measurement/pouring out skill was assessed by a panel of
judges in a session

_M_ + P/M + _P_ = Both skills were assessed by a panel of judges in a session
(only the underlined skill assessment was used in calculation)






Table 2.5

Summary tables for several 2 X 3 ANOVA based on completely randomised
groups for the reliability of assessment (agreement among the judges) of
pupils' ability to perform scientific manipulative skills in experiment 2.1

(a) Measuring volume of liquid using measuring cylinder

Source of variance            SS       df    MS      F         F (crit)     Significance of
                                                     (obsv.)   at p=0.05    F (obsv.)

A (2 method gr.)              0.209    1     0.209   0.837     4.00         n.s. (p ≤ 0.05)
B (3 panels of judges)        0.517    2     0.259   1.040     3.15         n.s. (p ≤ 0.05)
A X B                         0.616    2     0.308   1.234     3.15         n.s. (p ≤ 0.05)
Error                         28.450   114   0.249   -         -
Total                         29.792   119   -       -         -

(b) Pouring liquid out of measuring cylinder

Source of variance            SS       df    MS      F         F (crit)     Significance of
                                                     (obsv.)   at p=0.05    F (obsv.)

A (2 method gr.)              0.133    1     0.133   0.957     4.00         n.s. (p ≤ 0.05)
B (3 panels of judges)        0.267    2     0.134   0.960     3.15         n.s. (p ≤ 0.05)
A X B                         0.467    2     0.234   1.680     3.15         n.s. (p ≤ 0.05)
Error                         15.800   114   0.138   -         -
Total                         16.667   119   -       -         -

Table 2.5 (continued)

(c) Measuring volume of liquid using measuring cylinder

Source of variance            SS       df    MS      F         F (crit)     Significance of
                                                     (obsv.)   at p=0.05    F (obsv.)

A (2 method gr.)              0.208    1     0.208   0.833     4.00         n.s. (p ≤ 0.05)
B (3 sessions)                1.067    2     0.533   2.138     3.15         n.s. (p ≤ 0.05)
A X B                         0.067    2     0.033   0.134     3.15         n.s. (p ≤ 0.05)
Error                         28.450   114   0.249
Total                         29.792   119

(d) Pouring liquid out of measuring cylinder

Source of variance            SS       df    MS      F         F (crit)     Significance of
                                                     (obsv.)   at p=0.05    F (obsv.)

A (2 method gr.)              0.133    1     0.133   0.957     4.00         n.s. (p ≤ 0.05)
B (3 sessions)                0.617    2     0.308   2.219     3.15         n.s. (p ≤ 0.05)
A X B                         0.117    2     0.058   0.421     3.15         n.s. (p ≤ 0.05)
Error                         15.800   114   0.138
Total                         16.667   119

Tables 2.5 (a) - (d):

F (obsv.)  denotes the value of F ratio obtained from the experimental
           results.

F (crit)   denotes the critical value of F ratio at a given level of
           significance (i.e. p value) and at the appropriate df
           values (see reference 50 for the relevant statistical
           tables).

n.s.       means "not significant". This refers to the fact that the
           value of observed F ratio is less than the critical value
           of F at p=0.05 and thus F (observed) is not significant at
           the p ≤ 0.05 level.



Table 2.6

Achievement of pupils (participating in Session 1) for carrying out 2 manipulative
skills (measuring volume of liquid and pouring out liquid) involved in experiment
2.1, assessed both by (1) Method-T (i.e. on-the-spot observation of the performance of
the techniques) and (2) Method-P (i.e. checking the reported results/final product of
the experiment)

           Achievement in assessment by Method-T                      Achievement in
                                                                      assessment by
                                                                      Method-P

Pupil      Marks for measuring volume    Marks for pouring liquid    Marks for
No.        M     _M_+P   Total marks     P     M+_P_   Total marks   reported results
                         (6 max)                       (6 max)       (6 max)

 1         2     1       3               3     3       6             5
 2         -     -       -               -     -       -             -
 3         3     1       4               3     3       6             4
 4         2     0       2               2     3       5             4
 5         1     1       2               3     2       5             6
 6         2     3       5               3     3       6             5
 7         0     1       1               3     2       5             4
 8         1     0       1               3     3       6             1
 9         -     -       -               -     -       -             -
10         1     1       2               3     3       6             3
11         1     1       2               3     3       6             6
12         0     1       1               2     2       4             5
13         1     3       4               3     3       6             2
14         1     2       3               3     3       6             4
15         2     1       3               3     3       6             4
16         2     2       4               3     3       6             3
17         0     0       0               3     3       6             3
18         0     3       3               2     2       4             5
19         1     1       2               3     3       6             4
20         1     2       3               3     3       6             4
21         0     1       1               3     3       6             3
22         -     -       -               -     -       -             -
23         0     1       1               3     3       6             1

M/P: Only volume measurement/pouring out skill was assessed (max. marks for each
skill = 3)

_M_ + P/M + _P_: Both the skills were assessed, but only the underlined skill's mark is
involved in the calculation.



Table 2.7

Achievement of pupils (participating in Session 2) for carrying out 2 manipulative
skills (measuring volume of liquid and pouring out liquid) involved in experiment
2.1, assessed both by (1) Method-T (i.e. on-the-spot observation of the performance of
the techniques) and (2) Method-P (i.e. checking the reported results/final product of
the experiment).

           Achievement in assessment by Method-T                      Achievement in
                                                                      assessment by
                                                                      Method-P

Pupil      Marks for measuring volume    Marks for pouring liquid    Marks for
No.        M     _M_+P   Total marks     P     M+_P_   Total marks   reported results
                         (6 max)                       (6 max)       (6 max)

 1         2     1       3               3     3       6             3
 2         1     0       1               3     3       6             6
 3         0     0       0               3     3       6             0
 4         0     1       1               3     3       6             4
 5         0     0       0               3     2       5             4
 6         0     1       1               3     3       6             5
 7         1     1       2               3     3       6             4
 8         1     1       2               3     3       6             1
 9         0     0       0               3     3       6             3
10         0     0       0               3     3       6             6
11         1     0       1               3     3       6             3
12         2     1       3               3     3       6             4
13         1     1       2               3     3       6             5
14         3     2       5               3     2       5             3
15         2     2       4               3     3       6             5
16         0     1       1               3     3       6             5
17         3     3       6               3     3       6             4
18         3     3       6               3     3       6             2
19         0     1       1               3     3       6             4
20         0     0       0               3     2       5             3

M/P: Only volume measurement/pouring out skill was assessed (max. marks for
each skill = 3)

_M_ + P/M + _P_: Both the skills were assessed, but only the underlined skill's mark is
involved in the calculation.

Table 2.8

Achievement of pupils (participating in Session 3) for carrying out 2 manipulative
skills (measuring volume of liquid and pouring out liquid) involved in experiment
2.1, assessed both by (1) Method-T (i.e. on-the-spot observation of the performance of
the techniques) and (2) Method-P (i.e. checking the reported results/final product of
the experiment).

           Achievement in assessment by Method-T                      Achievement in
                                                                      assessment by
                                                                      Method-P

Pupil      Marks for measuring volume    Marks for pouring liquid    Marks for
No.        M     _M_+P   Total marks     P     M+_P_   Total marks   reported results
                         (6 max)                       (6 max)       (6 max)

 1         0     1       1               3     3       6             4
 2         0     1       1               3     3       6             4
 3         0     0       0               3     2       5             1
 4         1     3       4               3     3       6             5
 5         0     0       0               2     2       4             5
 6         1     0       1               3     3       6             6
 7         0     0       0               2     2       4             4
 8         2     2       4               2     2       4             2
 9         2     2       4               2     3       5             4
10         2     2       4               3     3       6             5
11         3     2       5               3     3       6             5
12         0     1       1               3     3       6             6
13         1     0       1               3     3       6             2
14         1     1       2               3     3       6             4
15         2     2       4               3     3       6             3
16         0     0       0               3     3       6             0
17         1     0       1               3     2       5             5
18         0     1       1               2     3       5             3
19         0     0       0               3     3       6             4
20         0     0       0               3     3       6             0

M/P: Only volume measurement/pouring out skill was assessed (max. marks for
each skill = 3)

_M_ + P/M + _P_: Both the skills were assessed, but only the underlined skill's mark is
involved in the calculation.

Table 2.9

Situation of total agreement ("1")/disagreement ("0") among 3 judges
in a panel during the assessment in session 1.

Pupil No.    M by panel A    P by panel B    M + P by panel C
                                             M        P

 1           0               1               0        1
 2           -               -               -        -
 3           1               1               0        1
 4           0               0               1        1
 5           0               1               0        0
 6           0               1               1        1
 7           1               1               0        0
 8           0               1               1        1
 9           -               -               -        -
10           0               1               0        1
11           0               1               0        1
12           1               0               0        0
13           0               1               1        1
14           0               1               0        1
15           0               1               0        1
16           0               1               0        1
17           1               1               1        1
18           1               0               1        0
19           0               1               0        1
20           0               1               0        1
21           1               1               0        1
22           -               -               -        -
23           1               1               0        1

M : only "Measurement of volume" skill assessed

P : only "Pouring out liquid" skill assessed

M + P : Both the skills assessed

Table 2.10

Situation of total agreement ("1")/disagreement ("0") among 3 judges
in a panel during the assessment in session 2.

Pupil No.    M by panel B    P by panel C    M + P by panel A
                                             M        P

 1           0               1               0        1
 2           0               1               1        1
 3           1               1               1        1
 4           1               1               0        1
 5           1               1               1        0
 6           1               1               0        1
 7           0               1               0        1
 8           0               1               0        1
 9           1               1               1        1
10           1               1               1        1
11           0               1               1        1
12           0               1               0        1
13           0               1               0        1
14           1               1               0        0
15           0               1               0        1
16           1               1               0        1
17           1               1               1        1
18           1               1               1        1
19           1               1               0        1
20           1               1               1        0

M : only "Measurement of volume" skill assessed

P : only "Pouring out liquid" skill assessed

M + P : Both the skills assessed

Table 2.11

Situation of total agreement ("1")/disagreement ("0") among 3 judges
in a panel during the assessment in session 3.

Pupil No.    M by panel C    P by panel A    M + P by panel B
                                             M        P

 1           1               1               0        1
 2           1               1               0        1
 3           1               1               1        0
 4           0               1               1        1
 5           1               0               1        0
 6           0               1               1        1
 7           1               0               1        0
 8           0               0               0        0
 9           0               0               0        1
10           0               1               0        1
11           1               1               0        1
12           1               1               0        1
13           0               1               1        1
14           0               1               0        1
15           0               1               0        1
16           1               1               1        1
17           0               1               1        0
18           1               0               0        1
19           1               1               1        1
20           1               1               1        1

M : only "Measurement of volume" skill assessed

P : only "Pouring out liquid" skill assessed

M + P : Both the skills assessed

Table 2.12

Pearson-r (Pearson product moment correlation coefficient) between the
achievements of pupils after being assessed by observation of their performance
(Method-T) and that after being assessed by checking the reported results of the
experiment (Method-P) for two practical skills, (a) measurement of volume of liquid
and (b) pouring out liquid, in the experiment 2.1

No. of run of the              Measurement of volume    Pouring out
experiment                     of liquid                liquid

1st                             0.159                    0.078
2nd                            -0.160                   -0.011
3rd                             0.223                    0.111
4th                            -0.023                   -0.044
5th                             0.100                   -0.183
6th                             0.306                    0.021

Combined runs 1-6, i.e.         0.142                   -0.104
for the whole experimental
work

(Results of all the samples, i.e. pupils participating in 3 sessions, were used in the
Pearson-r value calculation for each run of the experiment)

Fig. 2.1 Plots of achievement in assessment (on-the-spot
observation) by Method-T for measuring volume by
the pupils participating in 3 sessions against their
achievement in assessment by Method-P (checking the
reported results of the experiment) for carrying out
the whole experiment 2.1.

(The data are in Tables 2.6, 2.7 and 2.8)

Fig. 2.2 Plots of achievement in assessment (on-the-spot
observation) by Method-T for pouring out liquid by
the pupils participating in 3 sessions against their
achievement in assessment by Method-P (checking the
reported results of the experiment) for carrying out
the whole experiment 2.1.

(The data are in Tables 2.6, 2.7 and 2.8)



CHAPTER 3 



ASSESSMENT OF SKILL RETENTION 







3.1 Aim 

In this section of the thesis, studies have been made about the ability of 
school pupils to retain and reproduce practical skills involved in scientific 
work which have been taught earlier. 

The assessment of a pupil's ability to retain and reproduce knowledge and
skill taught earlier could be taken as a measure of success in teaching.
Information gained thereby regarding each pupil's performance could
provide important feedback for rebuilding the future teaching strategy.



To carry out this investigation, a group of pupils was taught how to
carry out certain scientific manipulative skills, and they were tested after
some time to find out how much they could retain and reproduce.

Comparison of this result with the result of assessment of an untaught
group (control group) would help us to exclude the effect of general
knowledge/ability gained by a pupil by simply growing up in a modern
society and spending time in the intellectual environment of a school,
and allow us to find out the pupil's ability to retain knowledge acquired
during teaching sessions, and to reproduce it, assuming that the teaching
was carried out at the highest possible standards.

3.2 Development of the study materials — Preliminary runs 

(i) Method 

In this part of the work, in which the retention ability of pupils was
investigated, a trial run was carried out only for the testing session and
not for the teaching session. A group of 5 pupils (14 years old) was invited
to take part in this trial run. We were able to find out how many solid
samples each pupil could study in half an hour and how many runs of the
second experiment (pH of the mixture of weak acid and weak alkali) were
possible in half an hour.

In general the objective was to find out if the preparation of the
work-sheet and arrangement of the materials were adequate for the
testing session.

(ii) Discussion of the findings of the trial run

No major (or significant) changes were required. The trial run was carried
out as planned and the same procedure was repeated for the testing
sessions.

3.3 Final run 

(i) Method

In this section of the study, work was carried out in two stages: the first
stage involved the teaching of practical skills and the second stage
involved testing the pupils' ability to reproduce these skills after a gap of
a certain period of time.

Four sets of fourth form pupils from a local comprehensive school
(Earlham School, Norwich) took part in this study. There were
approximately 25 pupils (boys and girls of age range 14-15 years) in each
set.

For each of the four teaching sessions, organised at the end of their
summer term, the pupils in each set were randomly divided into 3
groups. Group I was asked to carry out an experiment on the reaction of
dilute hydrochloric acid (HCl) with sodium thiosulphate solution
(Na2S2O3) [see work-sheet 3.1 in Appendix 2.1]. Pupils in Group II were
given a series of solid samples in ignition tubes and asked to heat them on
a Bunsen burner and record their observations [see work-sheet 3.2 in
Appendix 2.1]. All the materials needed for the experiments were
supplied at the desk before the start of the work. Group III pupils watched
a video film in which none of the skills used in the above-mentioned
experiments were involved.

For this experimental work, one large laboratory in Earlham School was
used. The video film was shown in a small room adjacent to the
laboratory. The laboratory floor was divided into 2 sections: one section
contained the acid - thiosulphate solution experiment involving volume
measurement by measuring cylinder, while the other section contained
the heating of solids experiments.

All the necessary materials were prepared and arranged at the UEA earlier.
On the day before the teaching session, these materials were moved to the
school and laid out on the bench. In between the sessions, the chemical
bottles were refilled and new sample tubes for heating experiments were
placed on the desk. Lists of chemicals and apparatus used in experiments
3.1 and 3.2 are given in Appendix 2.2.

Pupils in Groups I and II, engaged in the experimental work, were
supervised by four science teachers (two for each group) who spent about
one hour teaching various manipulative skills involved in the
experiment.

The Group I pupils were taught how to measure accurately a certain
volume of liquid using a measuring cylinder. They were asked to ensure
that the lower meniscus of the liquid touched the required mark on the
cylinder. For heating solids by Group II pupils, the following four
manipulative skills, viz. (a) the position of the test tube holder, (b) angle of
the test tube, (c) zone of the flame used in heating, and (d) colour of the
flame, were thought to be important and the pupils were given thorough
lessons on them, so that they could perform these skills correctly. The
pupils were told that (i) the test tube holder should be kept close to the top
end of the tube, (ii) the test tube should be held at an angle of about 45°
and not pointing towards any person, (iii) the sample should be in the hot
zone of the flame during heating, and (iv) a blue coloured flame should be
used for heating.

After two months of summer holidays, at the beginning of a new term,
each of these pupils was asked to take part in experimental work for an
hour and was given two experiments different from the previous ones,
but involving the above-mentioned skills. The two experiments were: (i)
the mixing of weak acid and weak alkali in different volume ratios and the
determination of the pH value of the mixture by using an indicator
solution (see work-sheet 3.3 in Appendix 2.1), and (ii) the heating of solids
in ignition tubes and identification of the nature (acid/neutral/alkaline)
of the gas given off (see work-sheet 3.4 in Appendix 2.1). The first
experiment provided the opportunity for assessing the volume
measurement skill, while the second experiment provided the
opportunity for assessing the 4 manipulative skills involved in heating
solids in a tube, as mentioned earlier for the teaching session.

This was the testing time, i.e. the time to find out how much a pupil could
remember and reproduce the skills which were taught earlier, and also to
look at the performance of the pupils who did not receive lessons on any
skill, in comparison with that of the pupils who were taught. For this
purpose each set of about 25 pupils, comprising samples from each of the 3
groups (i.e. measurement of volume group, heating group and video
group/control group), was assessed in a session lasting for about an
hour.

For this work, a spacious laboratory in Earlham School was used. It was 
the same laboratory as was used for the teaching sessions about two 
months previously. 

The numbering of the work-stations, randomisation of the pupils, and 
their identification with number labels was carried out in the same way as 
has been described earlier for the previous experiment (experiment 2.1) in 
Chapter 2. 



Materials required for both the experiments were laid out on the desk the
previous day and some of the used chemicals were replaced in between
the sessions. Appendix 2.2 contains the lists of chemicals and apparatus
used in experiments 3.3 and 3.4. There were three sessions to assess three
sets of pupils. Although four sets of pupils were given lessons, it was
possible to assess only three sets as the judges had to leave the
laboratory to attend to other business before the commencement of the
fourth session. However, this did not cause any difficulties for the data
analysis, as each set of pupils had an almost equal number of samples from
the three method groups (volume measurement, heating solid and control).

Each session, lasting for about an hour, was divided into two halves; one 
for the experiment involving measurement of volume and the other for 
the heating of solids in test tubes. Eight experienced science teachers acted
as judges. In the first half of the session, each judge assessed the pupils'
skill in measuring the volume of liquid using a measuring cylinder and, in
the second half, the four skills involved in heating solids.

At the beginning of each assessment session, each of the 8 judges was
provided with an assessment sheet (see Appendix 2.1) to record the marks
for the assessment of the five manipulative skills, one skill in the first
experiment (experiment 3.3) and four skills in the second experiment
(experiment 3.4), for each pupil. A mark of "1" was awarded for correct
execution of each skill and a mark of "0" for failing to do so. As a result,
each pupil could obtain a maximum of 8 marks (8 judges x 1 skill) in
experiment 3.3 (see work-sheet 3.3 in Appendix 2.1) and 32 marks
(8 judges x 4 skills) in experiment 3.4 (see work-sheet 3.4 in Appendix 2.1)
for correct execution of all the manipulative skills.

On the day before the assessment sessions took place, the judges were 
briefed clearly about the nature of their tasks and, following a discussion, 
they agreed upon a standard similar to that maintained during the 
teaching sessions, regarding the correctness of the manipulative skills. As 
the judges were different from the teachers involved in the earlier 
teaching sessions, they had no knowledge of the grouping of the pupils 
and this helped them to assess the pupils in an unbiased manner (without
knowing who had received lessons on which skill).

3.3 (ii) Results and Discussion

The achievement of pupils belonging to all the three groups in each of the
three sessions in carrying out the practical skills involved in the two
experiments has been determined by collating the results of the
assessment from the 8 judges' mark-sheets.

Appendix 2.3 contains the "raw data", i.e. the judges' mark-sheets for all
three sessions of the assessment. "✓" and "✗" in the mark-sheets
indicate the correct and incorrect execution of the manipulative skills.
Assigning mark "1" for "✓" and mark "0" for "✗", the total marks
(achievement) obtained by each pupil from 8 judges were calculated and
are given in Table 3.1 (for session 1), Table 3.2 (for session 2) and Table 3.3
(for session 3). As each session comprised an almost equal number of
pupils belonging to the 3 method groups (volume group, heating group
and control group), following the randomised division during the
teaching period earlier, the results of all 3 sessions were used in the
analysis to calculate an average achievement value for the pupils
belonging to each method group in carrying out the 5 skills (one in
experiment 3.3 and four in experiment 3.4).
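
The collation of the judges' marks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mark-sheet layout (one string of tick/cross symbols per pupil, one symbol per judge), the function names and the sample marks are hypothetical and are not taken from Appendix 2.3, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def total_marks(judge_marks):
    """Total for one pupil on one skill: 1 mark per tick, 0 per cross."""
    return sum(1 for mark in judge_marks if mark == "✓")


def group_summary(totals):
    """Mean and sample standard deviation of a group's total marks."""
    return mean(totals), stdev(totals)


# Hypothetical marks from 8 judges for three pupils (not real data).
sheets = {
    "pupil_1": "✓✓✓✗✓✓✓✓",
    "pupil_2": "✗✗✓✗✓✗✗✓",
    "pupil_3": "✓✓✓✓✓✓✓✓",
}

totals = {pupil: total_marks(marks) for pupil, marks in sheets.items()}
group_mean, group_sd = group_summary(list(totals.values()))
print(totals)
print(f"mean = {group_mean:.2f}, s.d. = {group_sd:.2f}")
```

With 8 judges the total for a single skill can never exceed 8, which matches the "max marks = 8" heading of each column in Tables 3.1 to 3.3.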



The mean values of achievement obtained from the analysis of the
experimental data are presented in Table 3.4.



In experiment 3.3, which involves the measurement of volume, pupils in
Group I, who had lessons in this skill, scored significantly better than
the other two groups, whose achievements are very similar to each other.
In experiment 3.4, pupils who had lessons on heating solids
produced, on the whole (i.e. considering all the skills involved), better
results than the other two groups. As separate skills are involved in
experiment 3.4, the results were further analysed to understand more
clearly the pupils' performance in each of the skills. For the two tasks
concerned with the position of the test-tube holder and angle of the tube,
there is no significant difference (considering the s.d.) between the
achievement of each group in carrying out these two skills. For the task
concerned with the zone of the flame, each group, even the taught one,
produced poor results, suggesting that teaching of the skill had no
permanent effect on the pupils.

However, in the task concerned with using the correct colour of flame, all
three groups have very high mean scores. Thus it would have been
extremely difficult to improve on their scores by this teaching.

The results indicate that the pupils who were given lessons on a practical
skill can perform this skill better than those who were not, even after a two
months' holiday, during which they did not take part in any academic
activity involving the practical skill. In experiment 3.3 the taught group is
almost twice as good as the untaught groups (mean 6.10 against 3.37 and
3.28; see Table 3.4), while in experiment 3.4 the taught group is slightly
better than the untaught groups.

In order to understand more clearly the effect of the variables on the
variability in score discussed above, further analysis of the pupils'
achievement data was carried out.

Several 3X3 ANOVAs based on randomised groups were carried out.
They were as follows:

(i) In experiment 3.3:- 3 method groups (A) X 3 sets (B)

(ii) In experiment 3.4, skill (a):- 3 method groups (A) X 3 sets (B)

(iii) In experiment 3.4, skill (b):- 3 method groups (A) X 3 sets (B)

(iv) In experiment 3.4, skill (c):- 3 method groups (A) X 3 sets (B)

(v) In experiment 3.4, skill (d):- 3 method groups (A) X 3 sets (B)

Information derived from the ANOVA is given in Table 3.5.

It seems from the analysis of variance (F_A is significant at the p < 0.001
level) that in experiment 3.3, the teaching of volume measurement has a
significant effect on the final results discussed earlier. The significance of
the F_A values also indicates that in experiment 3.4, the teaching of two skills,
viz. (a) position of the test tube holder, and (b) angle of the test tube, has a
significant effect on the final results; whereas teaching of the other two
skills, viz. (c) zone of the flame, and (d) colour of the flame, has no
significant effect on the final results. The statistical analysis of the pupils'
achievement data supports the conclusion drawn earlier regarding the
variability in score due to different conditions in the variable (i.e. method
group: taught group vs. untaught group). On the basis of the F_B or F_AXB
values, it can be suggested that the overall results gathered were not
affected significantly by the fact that the subjects (pupils) belonged to three
sets and were assessed in three separate sessions.

Table 3.1

Total marks obtained by each pupil assessed by 8 judges in Session 1
for carrying out volume measurement in experiment 3.3 and heating
solids in experiment 3.4.

Pupil No.   Experiment 3.3   Experiment 3.4
                             Position of        Angle of        Zone of         Colour of
                             test tube holder   test tube       flame           flame
            max marks = 8    max marks = 8      max marks = 8   max marks = 8   max marks = 8

1 H         3                6                  8               3               7
2           -                -                  -               -               -
3           -                -                  -               -               -
4 V         7                7                  6               1               5
5 V         8                7                  8               5               8
6           -                -                  -               -               -
7 C         2                8                  6               3               8
8 C         2                5                  7               1               8
9 C         4                                   8               8               8
10 V        8                2                  4               6               8
11 H        2                7                  8                               8
12 V        5                3                  5               1               8
13 H        4                7                  8               8               8
14 V        6                5                  7               5               8
15 H        1                7                  5               5               8
16 H        7                7                  8               3               8
17 C        7                8                  7               6               6
18          -                -                  -               -               -
19          -                -                  -               -               -
20          -                -                  -               -               -
21 V        8                3                  3               7               7
22 V        3                7                  4               7               6
23 C        7                4                  8               5               8
24 H                         8                  7               7               8
25 C        5                7                  7               6               7

V/H/C in column one indicate the group (volume measurement/heating
solids/control) the pupils belonged to during the teaching sessions earlier.



Table 3.2

Total marks obtained by each pupil assessed by 8 judges in Session 2
for carrying out volume measurement in experiment 3.3 and heating
solids in experiment 3.4.

Pupil No.   Experiment 3.3   Experiment 3.4
                             Position of        Angle of        Zone of         Colour of
                             test tube holder   test tube       flame           flame
            max marks = 8    max marks = 8      max marks = 8   max marks = 8   max marks = 8

1 H         5                7                  7               4
2           -                -                  -               -               -
3 C         2                2                  4               3               7
4 C         6                3                  7               1               8
5 H         7                7                  7                               8
6 H         5                8                  8               3               8
7 H         1                7                  7                               7
8 C         4                5                  6               2               8
9           -                -                  -               -               -
10 V        7                4                  7               5               8
11 V        8                7                  1               1               8
12 C        2                2                  3               2               8
13 H        3                5                  7               6               8
14          -                -                  -               -               -
15 C        1                6                  8               7               8
16 H        4                7                  8               6               8
17 V        7                7                  8               6               8
18 V        8                7                  8               3               8
19 V        7                7                  3               1
20 H        1                3                  6
21 C        5                3                  2               1
22 C        4                2                  5               5               3
23 V        8                8                  3               5               5
24 V        7                8                  6               6               5
25          -                -                  -               -               -

V/H/C in column one indicate the group (volume measurement/heating
solids/control) the pupils belonged to during the teaching sessions earlier.

Table 3.3

Total marks obtained by each pupil assessed by 8 judges in Session 3
for carrying out volume measurement in experiment 3.3 and heating
solids in experiment 3.4.

Pupil No.   Experiment 3.3   Experiment 3.4
                             Position of        Angle of        Zone of         Colour of
                             test tube holder   test tube       flame           flame
            max marks = 8    max marks = 8      max marks = 8   max marks = 8   max marks = 8

1 H                          6                  5               1               7
2 V         7                2                  2                               7
3 V         1                3                  4               1               8
4 V         6                4                  2               1               6
5 C         2                2                                  5               2
6 C         3                6                  7               2               8
7 C         3                3                  3               5               8
8 C         1                7                  6               3               8
9           -                -                  -               -               -
10 V        5                1                  4               1               8
11 C        3                7                  8               2               4
12 C        3                7                  5                               8
13 V        5                3                  4               2               8
14 C        2                8                  1               2               8
15 V        3                4                  1               2               8
16 V        4                3                  2
17 H        1                7                  7               2               8
18          -                -                  -               -               -
19 H        5                7                  6               5               4
20 H        5                7                  8               7               7
21          -                -                  -               -               -
22 C        1                7                  6               5               8
23 H        5                8                  7               8               8
24          -                -                  -               -               -
25 H        5                7                  7               8               8

V/H/C in column one indicate the group (volume measurement/heating
solids/control) the pupils belonged to during the teaching sessions earlier.

Table 3.4

Mean values of the achievement of different groups of pupils in carrying out several
practical skills in two experiments

Expt.  Practical skill assessed          Measurement of     Heating of solids   Control
                                         volume group       group               group
                                         x      s.d.        x      s.d.         x      s.d.

3.3    Measurement of volume             6.10   ±1.98       3.37   ±2.18        3.28   ±1.80

3.4    (a) Position of the test tube     4.86   ±2.21       6.74   ±1.12        4.86   ±2.42
           holder
       (b) Angle of the test tube        4.24   ±2.22       7.05   ±0.94        5.43   ±2.34
       (c) Zone of flame                 3.14   ±2.42       4.00   ±2.87        3.52   ±2.17
       (d) Colour of flame               6.90   ±1.90       6.74   ±2.49        6.71   ±2.31

(x = mean value, s.d. = standard deviation)

Table 3.5
Tables for 3X3 completely randomised ANOVA of the results of
pupils' ability to retain scientific manipulative skills in
experiment 3.3 and 3.4. (All the assessments were carried out by
Method -T)

(a) Experiment 3.3 Measurement of volume using measuring cylinder

Source of variance     SS       df   MS      F (obsv.)  F (crit)            Significance of F (obsv.)
A (3 method group)     98.26    2    49.13   13.80      8.25 at (p=0.001)   Significant at p < 0.001
B (3 class/set)        23.60    2    11.80   3.31       3.23 at (p=0.05)    Significant at p < 0.05
A X B                  24.85    4    6.21    1.74       2.61 at (p=0.05)    n.s. (p < 0.05)
Error                  160.50   45   3.56
Total                  307.21   53

(b) Experiment 3.4 Heating of solids (skill tested: Position of the
test tube holder)

Source of variance     SS       df   MS      F (obsv.)  F (crit)            Significance of F (obsv.)
A (3 method group)     60.77    2    30.39   10.16      8.25 at (p=0.001)   Significant at p < 0.001
B (3 sets)             4.11     2    2.06    0.69       3.23 at (p=0.05)    n.s. (p < 0.05)
A X B                  53.78    4    13.45   4.50       3.83 at (p=0.01)    Significant at p < 0.01
Error                  134.67   45   2.99    -          -                   -
Total                  253.33   53   -       -          -                   -

(c) Experiment 3.4 Heating of solids (skill tested: Angle of the test tube)

Source of variance     SS       df   MS      F (obsv.)  F (crit)            Significance of F (obsv.)
A (3 method group)     72.26    2    36.13   10.79      8.25 at (p=0.001)   Significant at (p < 0.001)
B (3 sets)             26.93    2    13.46   4.02       3.23 at (p=0.05)    Significant at (p < 0.05)
A X B                  13.96    4    3.49    1.04       2.61 at (p=0.05)    n.s. at (p < 0.05)
Error                  150.54   45   3.35
Total                  277.65   53

(d) Experiment 3.4 Heating of solids (skill tested: Zone of flame)

Source of variance     SS       df   MS      F (obsv.)  F (crit)            Significance of F (obsv.)
                                                        at p=0.05
A (3 method group)     14.93    2    7.46    1.29       3.23                n.s. (p < 0.05)
B (3 sets)             22.26    2    11.13   1.92       3.23                n.s. (p < 0.05)
A X B                  37.07    4    9.27    1.60       2.61                n.s. (p < 0.05)
Error                  261.17   45   5.80
Total                  335.43   53

(e) Experiment 3.4 Heating of solids (skill tested: Colour of flame)

Source of variance     SS       df   MS      F (obsv.)  F (crit)            Significance of F (obsv.)
                                                        at p=0.05
A (3 method group)     1.02     2    0.51    0.11       3.23                n.s. (p < 0.05)
B (3 sets)             12.26    2    6.13    1.26       3.23                n.s. (p < 0.05)
A X B                  4.32     4    1.08    0.22       2.61                n.s. (p < 0.05)
Error                  218.33   45   4.85
Total                  235.93   53

In tables 3.5 (a) - (e):

F (obsv.)  denotes the value of F ratio obtained from the
           experimental results.

F (crit)   denotes the critical value of F ratio at a given level
           of significance (i.e. p value) and at the appropriate
           df values (see reference 50 for the relevant statistical
           tables).

n.s.       means "not significant". This refers to the fact that the
           value of observed F ratio is less than the critical value of
           F at p=0.05 and thus F (observed) is not significant at
           the p < 0.05 level.

           However, when F (observed) is higher than the F (critical)
           value at p=0.05, 0.01 or 0.001, the F (observed) is thought
           to be significant at the corresponding level of significance
           (p-value).
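
The relation between the columns of Table 3.5 and the decision rule in the notes above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the thesis: the function names are hypothetical, and the SS, df and critical values are simply copied from Table 3.5 (a).

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Observed F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


def is_significant(f_obs, f_crit):
    """F (obsv.) is taken as significant when it exceeds F (crit)."""
    return f_obs > f_crit


# SS, df and F (crit) values as printed in Table 3.5 (a).
SS_ERROR, DF_ERROR = 160.50, 45
rows = {
    "A (3 method group)": (98.26, 2, 8.25),  # F (crit) at p = 0.001
    "B (3 class/set)": (23.60, 2, 3.23),     # F (crit) at p = 0.05
    "A X B": (24.85, 4, 2.61),               # F (crit) at p = 0.05
}

for source, (ss, df, f_crit) in rows.items():
    f_obs = f_ratio(ss, df, SS_ERROR, DF_ERROR)
    verdict = "significant" if is_significant(f_obs, f_crit) else "n.s."
    print(f"{source}: F (obsv.) = {f_obs:.2f}, F (crit) = {f_crit}, {verdict}")
```

The F values computed this way agree with the tabulated ones to within rounding; small differences arise because the table divides by the error MS already rounded to two decimal places.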
CHAPTER 4



ASSESSMENT OF SKILL TRANSFER

4.1 Aim

In this final part of the thesis, investigations have been made not only
into pupils' ability to retain skills, but also into their ability to transfer
the skills to a situation different from that encountered during the learning
sessions conducted earlier.

During the school period, a pupil's primary objective in learning a skill, and
in retaining and reproducing it in an assessment session, is often to gain a
good grade. However, in real life the purpose of the learning process is much
more than passing an examination. After leaving school, a pupil would
be expected to be involved in his/her working life, where he/she
encounters problems different from those met in a school laboratory. In
that situation the person would have to recall the skill learnt (or
knowledge gained) and transfer it to a new situation to solve the newly
encountered problem. Assessment of the skill transfer ability of pupils is
hence an important part of the practical examination process and this part
of the thesis has been used to study it.

For this purpose, a group of pupils was taught certain manipulative
skills. After a certain period of time, they were assessed on executing
manipulative skills which are similar but not the same, i.e. two different
skills having certain features in common. A control group (untaught
group) was also assessed so that its performance could be used as a
standard for comparing the results of the taught group.



4.2 Development of the study materials — preliminary runs 

(i) Method 

No trial run was carried out either for the teaching sessions or the testing
sessions used to assess the pupils' ability to retain the skills taught earlier
and then to transfer them to a new situation. However, experience gained
during the work involved in chapters 2 and 3 was taken into
consideration in organising the test session in this part of the
investigation. In addition to this, great care was taken to make the
work-sheets simple and easy to follow, and also to ensure an adequate supply
of chemicals and apparatus. For the teaching sessions, two experiments were
chosen: one experiment involved measurement of volume by using a
measuring cylinder while the other experiment concerned heating solids in
a test-tube over a Bunsen burner. Two different experiments were chosen
for the testing sessions. This time one experiment involved measuring
volume using a burette and the other involved heating liquid in a
test-tube over a Bunsen burner.

4.3 Final run 

(i) Method

Two sets of third form pupils (13-14 years old), each comprising about 24
boys, from a local independent school (Norwich School, The Close,
Norwich) took part in this investigation. At the beginning of the spring
term they were given lessons by four experienced science teachers (two
teachers per set) on a number of scientific skills involved in two
experiments: (i) Heat of reaction between acid and alkali (experiment 4.1),
and (ii) Heating solid samples in test tubes (experiment 4.2) [see
work-sheets 4.1 and 4.2 in Appendix 3.1].



For these teaching sessions, each set of pupils was randomly divided into
three groups: Group I was given a lesson on the measurement of volume
of acid/alkali involved in experiment 4.1, Group II was given lessons on
heating solids in test-tubes, and Group III was kept as the Control group.
The pupils in Group III were shown a video film unrelated to either
of the experiments mentioned above.

For this teaching activity, two laboratories in Norwich School were used.
One was equipped with materials needed for the volume measurement
experiment using a measuring cylinder, and pupils belonging to Group I
from both the sets were taught there. The other laboratory was equipped
with materials needed for the heating of solids in test tubes, and the pupils
belonging to Group II from both the sets were taught there. There were two
teachers in each laboratory and the pupils were given lessons on
performing several manipulative skills accurately over a period of one
hour. Chemicals and apparatus used in the teaching sessions (experiments
4.1 and 4.2) are listed in Appendix 3.2.



The pupils in Group I (measurement of volume group) were taught how
to measure accurately a definite volume of liquid by using a measuring
cylinder. It was stressed that the lower meniscus must touch the required
volume mark on the cylinder. The pupils in Group II (heating group)
were taught how to follow the necessary steps involved in heating a solid
sample in a test-tube. In heating solids, the pupils were expected to do the
following: (a) place the test-tube holder near the top of the tube, (b) hold
the tube at an angle of about 45° and not pointing towards any person,
(c) place the end of the tube in the hot zone of the flame, and (d) use the
blue flame. These four steps were thought to be important in influencing
the results of the thermal treatment of solids.

After a period of about one and a half months, the pupils were tested.
This time, all the pupils in each set were given experimental work and each
pupil had to do two experiments. The first experiment (experiment 4.3)
involved measuring a certain volume of liquid (water and
alcohol) using a burette and weighing it in order to determine its density
(see work-sheet 4.3 in Appendix 3.1). The second experiment (experiment
4.4) dealt with the heating of a liquid in a test-tube (see work-sheet 4.4 in
Appendix 3.1). For this, pupils were asked to heat a series of sugar
solutions mixed with Benedict's solution and to identify whether they were
reducing or non-reducing sugars.

For this testing session two science laboratories (the same as those used in
the teaching session) in Norwich School were used. One laboratory had
facilities for carrying out the experiments involving measurement of
volume by using a burette, while the other laboratory had facilities for the
experiment involving heating solutions in test tubes using a Bunsen
burner.

Two sets of pupils, each containing samples from Groups I, II and III, took
part and carried out both the experiments, which lasted for about an hour.
In the first half of the session one set did the volume measurement
experiments, while the other set did the heating experiments. Then in the
second half of the session, the two sets swapped laboratories as well as
experiments. The lists of chemicals and apparatus used in experiments 4.3
and 4.4 are given in Appendix 3.2.

The work-stations were numbered in the same way as described for the
testing sessions in the previous chapters. The pupils were randomised and
given number labels at the entrance of the laboratory at the beginning of
the first part of the session, and this numbering remained valid for the
second half of the session, even though the whole set of pupils swapped
laboratories after the first half of the session.

The performance of the pupils in these two experiments, which took 
about an hour, was judged by observation (Method -T) by experienced 
teachers who had no knowledge of the teaching sessions. A panel of 3 
judges assessed the measurement of volume skill in experiment 4.3 for 
both the sets, and another panel of 3 judges assessed the heating skills in 
experiment 4.4 for both the sets. Before the start of the experiments, the 
pupils in each set were randomly allocated to various work-stations. 

The judges were given clear instructions regarding their assessment
scheme. For the heating experiments (experiment 4.4), the 4 important skills
mentioned earlier in the description of the teaching sessions were
checked. Each of the judges assessed all the pupils once (4 skills). As there
were 3 judges, each pupil could obtain a maximum of 3 marks for performing
each heating skill correctly. In other words, each pupil was assessed thrice
altogether and could obtain a maximum of 12 marks (3 x 4) for correct
execution of all the 4 skills. During the measurement of volume
experiment (experiment 4.3), one judge (in the panel of three) was able to
assess only 8 pupils (one third of the set) stationed at one bench, during
the half hour taken by the set to complete all the experimental work on 5
samples. In other words, each pupil was assessed only once and obtained 1
mark for measuring accurately a certain volume of liquid (within an error
limit of ± 0.1 ml) by using a burette. For this purpose, each of the three
judges stayed near a set of burettes containing a series of liquids to be used
by 8 pupils at a bench and noted the burette readings before and after a
pupil drained out the liquid. There were three rows of burettes, attended
by three judges, to be used by pupils from three benches. These burettes
were refilled from time to time.

In addition to the assessment by observation of the skill-performance 
(Method -T), the final results of the volume measurement experiment, 
which were the density values of a series of liquids, were also assessed. 
This provided a basis for assessing the skill-performance from another 
angle and is called Method -P. Experiments were carried out by me (5 runs
per sample) to find out the exact density values for all the samples and
these data (with the standard deviation as the error limit) were used as
standards for assessing the results reported by the pupils (see Appendix
3.3). As there were 5 samples, each pupil was awarded a maximum of 5
marks for reporting correct density data.
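
The marking rules for experiments 4.3 and 4.4 described above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the sample labels and all the numbers are hypothetical, and taking the acceptance range for Method -P as the mean of the 5 standard runs plus or minus their sample standard deviation is an assumption about how "the standard deviation as the error limit" was applied.

```python
from statistics import mean, stdev


def burette_mark(initial_reading, final_reading, target_volume, tolerance=0.1):
    """Method -T, experiment 4.3: 1 mark if the delivered volume (ml)
    is within the tolerance of the target volume, otherwise 0."""
    delivered = abs(final_reading - initial_reading)
    # Round to 0.01 ml so that binary floating-point noise cannot push
    # a borderline reading just over the limit.
    error = round(abs(delivered - target_volume), 2)
    return 1 if error <= tolerance else 0


def heating_marks(judgements):
    """Method -T, experiment 4.4: one mark per judge per correct skill.
    `judgements` holds one list of 4 booleans for each of the 3 judges."""
    return sum(1 for judge in judgements for correct in judge if correct)


def standard_range(runs):
    """Acceptance range for one sample: mean of the runs +/- their s.d."""
    m, s = mean(runs), stdev(runs)
    return m - s, m + s


def density_marks(reported, standards):
    """Method -P: one mark per sample whose reported density lies
    inside the acceptance range of the standard value."""
    return sum(
        1
        for sample, value in reported.items()
        if standards[sample][0] <= value <= standards[sample][1]
    )


# Hypothetical standard runs (g/ml) for two of the samples.
standard_runs = {
    "sample_1": [0.998, 1.000, 1.002, 0.999, 1.001],
    "sample_2": [0.788, 0.790, 0.792, 0.789, 0.791],
}
standards = {name: standard_range(runs) for name, runs in standard_runs.items()}

print(burette_mark(0.00, 25.10, 25.0))
print(heating_marks([[True, True, False, True]] * 3))
print(density_marks({"sample_1": 1.001, "sample_2": 0.800}, standards))
```

With 3 judges and 4 skills, `heating_marks` can return at most 12, and with 5 samples `density_marks` can return at most 5, in line with the maximum marks stated above.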

4.3 Final Run 

(ii) Results and Discussion 

The information available in the judges' assessment sheets (see Appendix
3.4) has been collated and given in Tables 4.1 and 4.2. This provided the
achievement data for pupils (assessed by Method -T) belonging to the three
groups in each of the two sessions in carrying out (1) measurement of
volume using burettes in experiment 4.3 (see work-sheet 4.3 in
Appendix 3.1) and (2) heating solutions in test-tubes using a Bunsen burner
in experiment 4.4 (see work-sheet 4.4 in Appendix 3.1). The achievement
of these pupils in carrying out volume measurement based on the
assessment (by Method -P) of their final results (i.e. accuracy of the density
data) of experiment 4.3 has also been determined. Finally, the data in
Tables 4.1 and 4.2 have been analysed for the mean values of achievement,
together with the standard deviation, of pupils belonging to each of the
three groups and these are given in Table 4.3.

In experiment 4.3, the group who had lessons on volume measurement
using a measuring cylinder did better than the other two groups
even though this time the volume measurement was done by using a
burette. The other two groups seem to have done quite well, although
they had no lessons on the skill of measurement of volume.

The achievement data obtained by assessment Method -P (i.e. marking the
reported density values) of experiment 4.3 also suggest that the group who
had lessons on the volume measurement skill performed better than the
other two groups. However, each group's achievement is so high that one
can conclude that the teaching of the volume measurement skill has probably
no significant effect on the final results (i.e. density values) determined in
this way. Pearson -r values between the achievement when assessed by
Method -T and that when assessed by Method -P for (i) the volume
measurement group, (ii) the heating group, and (iii) the control group have
been calculated and are given in Table 4.4.

As the Pearson -r value for each of these 3 groups lies within +0.5 and -0.5,
it is unlikely that there is a strong correlation between the two variables.
However, it is worthwhile looking at the critical values of "r" given in the
statistical table and finding out the statistical significance of the calculated
Pearson -r values. All the 3 "r" values seem to be lower than the critical
"r" values for df = 12, 12, 11 respectively even at p = 0.05, indicating that the
probability of correlation by chance factor is higher than 5%. Thus the
Null Hypothesis can be accepted at this level, stating that the existing
correlation is due to chance factor and there is no significant correlation
between the two sets of data. However, although the experimental results
do not provide any support for the hypothesis that improvement in
pupils' ability to measure liquid volume by burette accurately will make
an improvement in the accuracy of the reported density data of this liquid,
the hypothesis cannot be rejected completely. This is because the total
number of samples in each group taking part in the experiment was
relatively small compared with a nation-wide situation having a large
number of samples, which may provide a more meaningful correlation
supportive of the experimental hypothesis. Nevertheless, the design of
experiments used in this study was sound statistically, as each method
group contained almost equal numbers of randomised samples of both
sexes drawn from a pool of 40 pupils (2 sets) which could be considered as
representative of the pupils of the same age group in the country.
Figures 4.1, 4.2 and 4.3 show the plots of the achievement data
assessed by Method -T against those assessed by Method -P for
measuring volume using a burette by each of the three groups of pupils.
No clear trend or pattern can be seen in any of these figures, which
supports the prediction from the Pearson -r value calculation that no
significant correlation exists between the achievement data obtained from
the two different modes of assessment of the volume measurement skill of
any of the three groups.
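
The correlation test described above can be sketched as follows. This is a hedged illustration rather than the calculation actually performed: the function names and the marks are hypothetical, and the critical value 0.532 (two-tailed, p = 0.05, df = 12) is quoted from commonly published tables and should be checked against the table actually used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two mark lists."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 pairs")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # r is undefined when every pupil has the same mark on one method.
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def is_significant(r, r_crit):
    """The null hypothesis is retained while |r| stays below r (critical)."""
    return abs(r) >= r_crit


# Hypothetical marks for 14 pupils (df = n - 2 = 12), not real data.
method_t = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]  # max 1 mark
method_p = [5, 4, 5, 3, 5, 4, 4, 5, 5, 3, 4, 5, 5, 4]  # max 5 marks

r = pearson_r(method_t, method_p)
R_CRIT_DF12 = 0.532  # assumed tabulated value, two-tailed, p = 0.05
print(f"r = {r:.3f}, significant = {is_significant(r, R_CRIT_DF12)}")
```

The guard against a constant variable matters here because the Method -T mark in experiment 4.3 is only 0 or 1 per pupil: if every pupil in a group were awarded the same mark, no correlation coefficient could be calculated for that group.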



Results of experiment 4.4 indicate that the group which had lessons on
heating solids in test tubes performed better in heating solutions in test
tubes than the other two groups. The heating group's better performance
is clearly visible in all the four skills involved in experiment 4.4.
However, except for the task concerned with the skill of position of the test
tube holder, the mean value of achievement is considerably higher for all
the three groups. It may be due to the fact that teaching had very little
impact on the performance of the heating group.

Several 3 X 2 ANOVAs based on randomised groups were done. They were
as follows:

(i) In experiment 4.3, assessment of volume measurement:
3 method groups (A) X 2 sets (B)

(ii) In experiment 4.3, assessment of the final results (density data):
3 method groups (A) X 2 sets (B)

(iii) In experiment 4.4, skill (a):
3 method groups (A) X 2 sets (B)

(iv) In experiment 4.4, skill (b):
3 method groups (A) X 2 sets (B)

(v) In experiment 4.4, skill (c):
3 method groups (A) X 2 sets (B)

(vi) In experiment 4.4, skill (d):
3 method groups (A) X 2 sets (B)

Information derived from the ANOVA is in Table 4.5. 

None of the F_A, F_B or F_AxB values seems to be significant even at the p < 0.05 level.
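The arithmetic behind each summary table can be illustrated with a short Python sketch: every mean square is the sum of squares divided by its df, and each observed F is that mean square divided by the error mean square. The sums of squares, df and critical F values below are copied from Table 4.5 (a); the function name and the dictionary layout are hypothetical, and the sketch does not reproduce how the sums of squares themselves were obtained from the raw marks.

```python
def f_ratios(rows, error_key="Error"):
    """Return {source: (MS, F)} for every source except the error term."""
    ss_error, df_error = rows[error_key]
    ms_error = ss_error / df_error
    result = {}
    for source, (ss, df) in rows.items():
        if source == error_key:
            continue
        ms = ss / df
        result[source] = (ms, ms / ms_error)
    return result


# (SS, df) pairs as printed in Table 4.5 (a): Method -T, experiment 4.3.
table_4_5_a = {
    "A (method group)": (0.429, 2),
    "B (class/set)": (0.595, 1),
    "A X B": (0.0479, 2),
    "Error": (8.571, 36),
}

# Critical F values at p = 0.05 as printed in Table 4.5.
F_CRIT_05 = {"A (method group)": 3.32, "B (class/set)": 4.17, "A X B": 3.32}

ratios = f_ratios(table_4_5_a)
for source, (ms, f_obs) in ratios.items():
    verdict = "n.s." if f_obs < F_CRIT_05[source] else "significant"
    print(f"{source}: MS = {ms:.4f}, F (obsv.) = {f_obs:.3f}, "
          f"F (crit) = {F_CRIT_05[source]}, {verdict}")
```

Small differences in the last decimal place from the printed F values are to be expected, since the printed table rounds the error mean square before dividing.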

From the analysis of variance, it appears that the teaching of a skill to a
group had no significant effect on the achievement. We have found that
for a few skills the taught group was slightly better than the other two
groups and for the other skills the achievements of all the three groups were
very similar. However, this does not necessarily mean that the higher
achievement of the taught group observed in the assessment of a
few skills has no significance at all. The lack of statistical significance of
F_A at the p < 0.05 level could be due to the high standard deviation of the
data. Nevertheless, F_A could be meaningful at a level slightly higher than
p = 0.05 and thus it is worthwhile to recognise the higher achievement
(although small) of the taught group compared to the untaught groups in
extending the skill to a new situation. The ANOVA also suggests that the
results of assessment have not been significantly influenced by having the
subjects in 2 sets and assessing them in 2 separate sessions.

The lack of correlation between the achievement data obtained from two
methods of assessment raises questions regarding the assumption made
earlier, that "such a comparison can be used as a measure of validity of the
method of assessment". Now one may ask if both the methods are
invalid or either of the methods is invalid. If the latter case is correct,
which one is invalid?

To address these questions, I would suggest that perhaps one of these 
methods cannot be used to validate the other. The basic reason is, as it 
appears from the findings of this research work, that we are not 
measuring the same thing in two methods of assessment. We are 
measuring two different aspects of the same experiment. 

To clarify this statement I would state that in the Method -T of assessment, 
we are measuring a few specific psychomotor skills, but in Method -P we 
are assessing the outcome of the extended task which involves the effect 
of skills already assessed by Method -T plus some other skills which are
required to complete the whole experiment. So the picture produced by
the achievement data obtained from Method -P does not give a mirror 
image of that obtained from the assessment Method -T. 

This suggests that we cannot substitute the assessment Method -T by 
Method -P, even though the latter method is more convenient for a 
teacher, as it can be carried out at a later stage, i.e. outside the busy lesson 
time. However, assessment by Method -T is essential as it allows us to 
observe in-situ how a pupil performs each manipulative skill. 
Assessment by Method -P, on the other hand, deals with the cumulative 
effect of a series of manipulative skills and cognitive skills which is 
submitted as the final result of the whole experiment. 

Thus one of these methods cannot be used to measure the validity of the
other. Assessment of two different aspects of an experiment by two 
methods is likely to be the reason for not obtaining a significant 
correlation between the achievement data of pupils assessed by these two 
methods of assessment. 

To validate one method by another, it is vital to ensure that both the
methods are measuring the same thing. For example, we can measure the
thickness of a piece of glass by using a ruler and we can find the validity of
this method by measuring again the same thickness by a more
sophisticated technique, viz. slide callipers or spherometer.

In the work described in this thesis, the situation seems to be different. 
The two methods, T and P are not assessing the same aspects of an 
experiment: instead they are assessing two different aspects of the same
experiment.

Let us consider experiment 2.1 in Chapter 2. Here pupils measured the
volume of liquid by a measuring cylinder, and poured it out into a flask
containing Mg. These two skills are assessed by Method -T. One may
expect that if these two skills are performed correctly, the time needed for
the Mg ribbon to dissolve completely in acid (the basis of the assessment by
Method -P) would be correct too, and that pupils would obtain full
marks when assessed from both directions (Method -T and -P), providing a
significant correlation between the achievement data obtained from the two
methods of assessment. But a careful analysis of the steps involved in the
work assessed by the two methods reveals that the assessment by
Method -P deals with a few more aspects of the pupils' ability to
perform practical skills in addition to those already assessed by Method
-T. These extra skills will have an influence on the results and consequently
the value reported as the final result of the whole experiment does not
represent exactly the same skills of the pupil as assessed by Method -T.
Similar arguments can be put forward regarding the assessment of two
different aspects of experiment 4.3 in Chapter 4 by two different methods.
This explains why we are not obtaining a significant correlation between
the pupils' achievement data obtained from the two methods of assessment
used in each of these two experiments.

To validate either Method -T or Method -P of assessment, it would be 
necessary to design (formulate) another procedure which measures exactly 
the same thing. 

To illustrate this idea, I would say that in order to validate assessment by
Method -T of pupils' ability to measure volume of liquid by a measuring
cylinder, one could repeat the experiment by using a burette and still assess
the skill performance by Method -T. Similarly, to validate the assessment
of the final results of the whole experiment submitted by the pupil as a
written report (i.e. Method -P), one could devise another method for
completing the subsequent steps in experiment 2.1 (Chapter 2) or 4.3
(Chapter 4) by some other sophisticated means. Thus two ways of carrying
out the same aspects of an experiment, both assessed by Method -P,
become worth comparing.



Table 4.1

Total marks obtained by the pupils in set 1 for carrying out volume
measurement by burette in experiment 4.3 and heating liquids in test
tubes in experiment 4.4 (*assessed by Method -P) (see page [illegible]
for explanation of max. marks)

                  Experiment 4.3    Experiment 4.4 (session 1)
                  (session 2)
Pupil             max marks         Position of   Angle of     Zone of      Colour of
No.      Group    = 1     = 5*      test tube     test tube    flame        flame
                                    holder
                                    max marks     max marks    max marks    max marks
                                    = 3           = 3          = 3          = 3

1        H        1       3         3             3            3            3
2        C        1       2                       3            3            3
3        V        1       3                       1            2            3
4        H        1       2         3             3            3            3
5        -        -       -         -             -            -            -
6        V        1       3                       3            3            3
7        C                2                       2            3            3
8        V        1       1         1             3            2            2
9        V                3         3             3            3            2
10       C                                        1            1            2
11       C                2         3             3            2            1
12       -        -       -         -             -            -            -
13       H                3         3             3            3            3
14       H        1       1         3             2            3            3
15       V                4         3             2                         1
16       C        1       3         1             3            2            2
17       V        1       4         3             3            3            3
18       H                4         1             3            2            3
19       -        -       -         -             -            -            -
20       H        1       4         2             3            3            3
21       C        1       1         3             3            3            3
22       V                3         3             3            2            3
23       C                4         1             3                         3
24       -        -       -         -             -            -            -
25       H                3         3             3            3            3

V/H/C in column two indicate the group (volume measurement/heating
solids/control) they belonged to during the teaching sessions earlier.



Table 4.2

Total marks obtained by the pupils in set 2 for carrying out volume
measurement by burette in experiment 4.3 and heating liquids in test
tubes in experiment 4.4 (*assessed by Method -P) (see page [illegible]
for explanation of max. marks)

                  Experiment 4.3    Experiment 4.4 (session 2)
                  (session 1)
Pupil             max mark          Position of   Angle of     Zone of      Colour of
No.      Group    = 1     = 5*      test tube     test tube    flame        flame
                                    holder
                                    max mark      max mark     max mark     max mark
                                    = 3           = 3          = 3          = 3

1        C        1       3         2             2            3            3
2        V        1       2         3             2            3            3
3        V        1       4         3             2            2            3
4        V        1       2                       3            3            3
5        V        1       3         1             3            2            3
6        C        1       4         2             2            3            2
7        H        1       1         3             3            3            3
8        H        1       3         3             3            3            3
9        H        1       2                       2            3            3
10       V        1       3         3             1            3            3
11       C        1       2                       2            3            3
12       H        1       2         1             3            3            3
13       C        1       2         2             3            3            3
14       C                4         2             3            2            3
15       -        -       -         -             -            -            -
16       C                4                       2            3            3
17       H                3         3             3            3            [?]
18       H        1       2                       2            3            3
19       C                1                       2            3            3
20       H        1       3         3             3                         3
21       H                2         1             2            2            [?]
22       V                3         3             3            3            3
23       V        1       2         3             2            3            3
24       V        1       4                       3            1            3
25       -        -       -         -             -            -            -

V/H/C in column two indicate the group (volume measurement/heating
solids/control) they belonged to during the teaching sessions earlier.



Table 4.3

Mean values of the achievement of different groups of pupils in carrying out several
practical skills in two experiments: 4.3 and 4.4.

Expt.   Practical skill          Measurement of   Heating of     Control        Max.
        assessed                 volume gr.       solids gr.     Group          value of x
                                 x ± s.d.         x ± s.d.       x ± s.d.

4.3     Measurement of volume    0.73 ± 0.44      0.67 ± 0.47    0.50 ± 0.50    1
        *                        2.93 ± 0.85      2.53 ± 0.88    2.43 ± 1.24    5

4.4     (a) Position of the
            test tube holder     1.93 ± 1.34      2.13 ± 1.15    1.14 ± 1.12    3
        (b) Angle of the
            test tube            2.47 ± 0.72      2.73 ± 0.44    2.43 ± 0.62    3
        (c) Zone of flame        2.33 ± 0.87      2.66 ± 0.79    2.43 ± 0.90    3
        (d) Colour of flame      2.73 ± 0.57      2.87 ± 0.34    2.64 ± 0.61    3

(x = mean value, s.d. = standard deviation)

* Results of the skill assessed by Method -P (marking the final outcome of the
experiment, i.e. density data).

[The rest of the data in the table are results of assessment by Method -T (on-the-spot
observation by the judges)]
(See pages [illegible] for explanation of maximum value of x).
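The mean and s.d. entries for the Method -T volume measurement marks can be checked with a short calculation. The sketch below is illustrative only. Since each pupil scores either 0 or 1 on this item, a group mean fixes the proportion of pupils scoring 1; the counts used here (11 of 15, 10 of 15 and 7 of 14) are assumptions chosen to be consistent with the tabulated means, not figures stated in the text. It also assumes the population form of the standard deviation (divisor n), and the function names are hypothetical.

```python
from statistics import mean, pstdev


def binary_marks(n_correct, n_pupils):
    """Build a list of 0/1 marks with n_correct pupils scoring 1."""
    return [1] * n_correct + [0] * (n_pupils - n_correct)


def summarise(marks):
    """Return (mean, population s.d.), both rounded to two decimal places."""
    return round(mean(marks), 2), round(pstdev(marks), 2)


# Assumed counts of pupils scoring 1 (hypothetical, see the note above).
assumed_counts = {
    "Measurement of volume gr.": (11, 15),
    "Heating of solids gr.": (10, 15),
    "Control Group": (7, 14),
}

for group, (n_correct, n_pupils) in assumed_counts.items():
    x, sd = summarise(binary_marks(n_correct, n_pupils))
    print(f"{group}: x = {x:.2f}, s.d. = {sd:.2f}")
```

Under these assumptions the calculation gives the same rounded values as the first row of the table, which suggests (but does not prove) that the tabulated s.d. is the population form.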



TABLE 4.4

Pearson-r (Pearson product moment correlation coefficient) between the
achievements of pupils after being assessed by Method -T and that after
being assessed by Method -P for their volume measurement skill in
experiment 4.3.

Volume measurement    Heating Group    Control Group

-0.223                -0.373           0.133



Table 4.5



Summary tables for 3 X 2 completely randomised ANOVA of the results of
studies on pupils' skill transfer abilities (see experiments 4.3
and 4.4 in Chapter 4).

(a) Experiment 4.3 Measurement of volume of some liquids by burette
(assessed by the judges on the spot, i.e. Method -T)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    0.429     2     0.2145    0.9003     3.32         n.s. (p < 0.05)
B (class/set)       0.595     1     0.595     2.499      4.17         n.s. (p < 0.05)
A X B               0.0479    2     0.024     0.1005     3.32         n.s. (p < 0.05)
Error               8.571     36    0.238     -          -
Total               9.643     41    -         -          -



(b) Experiment 4.3 Measuring volume by burette (assessed on the basis of
reported density data, i.e. Method -P)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    1.334     2     0.667     0.623      3.32         n.s. (p < 0.05)
B (class/set)       0.00024   1     0.00024   0.000224   4.17         n.s. (p < 0.05)
A X B               3.999     2     1.9995    1.867      3.32         n.s. (p < 0.05)
Error               38.572    36    1.0714    -          -
Total               43.905    41    -         -          -



(c) Experiment 4.4 Heating of solutions in test tube
(combined four skills, assessed by Method -T)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    28.04     2     14.02     2.82       3.32         n.s. (p < 0.05)
B (class/set)       0.85      1     0.425     0.17       4.17         n.s. (p < 0.05)
A X B               13.01     2     6.505     1.31       3.32         n.s. (p < 0.05)
Error               179.00    36    4.972     -          -
Total               220.90    41    -         -          -

(d) Experiment 4.4 Heating of solutions in test tube,
skill tested: Position of the test tube holder
(assessed by Method -T)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    9.48      2     4.74      3.02       3.32         n.s. (p < 0.05)
B (class/set)       0.10      1     0.10      0.064      4.17         n.s. (p < 0.05)
A X B               2.33      2     1.165     0.74       3.32         n.s. (p < 0.05)
Error               56.57     36    1.571     -          -
Total               68.48     41    -         -          -



(e) Experiment 4.4 Heating of solutions in test tube,
skill tested: Angle of the test tube
(assessed by Method -T)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    1.186     2     0.593     1.467      3.32         n.s. (p < 0.05)
B (class/set)       0.59      1     0.59      1.460      4.17         n.s. (p < 0.05)
A X B               0.06      2     0.03      0.074      3.32         n.s. (p < 0.05)
Error               14.564    36    0.405     -          -
Total               16.4      41    -         -          -



(f) Experiment 4.4 Heating of solutions in test tube,
skill tested: Zone of flame
(assessed by Method -T)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    0.76      2     0.38      0.54       3.32         n.s. (p < 0.05)
B (class/set)       -0.47     1     -0.47     -0.66      4.17         n.s. (p < 0.05)
A X B               4.47      2     2.235     3.14       3.32         n.s. (p < 0.05)
Error               25.72     36    0.714     -          -
Total               30.48     41    -         -          -



(g) Experiment 4.4 Heating of solutions in test tube,
skill tested: Colour of flame
(assessed by Method -T)

Source of           SS        df    MS        F          F (crit)     Significance of
variance                                      (obsv.)    at p=0.05    F (obsv.)

A (method group)    0.62      2     0.31      1.24       3.32         n.s. (p < 0.05)
B (class/set)       0.86      1     0.86      3.44       4.17         n.s. (p < 0.05)
A X B               1.00      2     0.50      2.00       3.32         n.s. (p < 0.05)
Error               9.14      36    0.254     -          -
Total               11.62     41    -         -          -



Tables 4.5 (a) - (g):

F (obsv.)   denotes the value of F ratio obtained from the experimental
            results.

F (crit)    denotes the critical value of F ratio at a given level of
            significance (i.e. p value) and at the appropriate df
            values (see reference 50 for the relevant statistical tables).

n.s.        means "not significant". This refers to the fact that the
            value of observed F ratio is less than the critical value
            of F at p=0.05 and thus F (observed) is not significant at
            the p < 0.05 level.



[Plot not reproduced. Horizontal axis: Achievement in assessment by Method -T (max. mark = 1).]

Fig. 4.1 Plots of achievement data obtained from assessment by
Method -T for measuring volume using burette by pupils
of the "Volume measurement" group (those who got lessons
on measuring volume by a measuring cylinder)
participating in both the sessions against their
achievement data obtained from assessment by Method -P
for carrying out the whole experiment 4.3.



[Plot not reproduced. Horizontal axis: Achievement in assessment by Method -T (max. mark = 1.0).]

Fig. 4.2 Plots of the achievement data obtained from assessment
by Method -T for measuring volume using burette by pupils
of the "heating solids" group (those who got lessons on
heating solids) participating in both the sessions against
their achievement data obtained from assessment by
Method -P for carrying out the whole experiment 4.3.

Fig. 4.3 Plots of the achievement data obtained from assessment
by Method -T for measuring volume using burette by
pupils of the "Control" group (those who did not get
any lessons) participating in both the sessions against
their achievement data obtained from assessment by
Method -P for carrying out the whole experiment 4.3.



CHAPTER 5 



CONCLUSIONS AND RECOMMENDATIONS FOR FUTURE WORK

The findings of the work described in this thesis (Chapters 2, 3 and 4) lead
to a number of conclusions which have been summarised here.

The results of the first stage of the work suggest that the assessment has 
been highly reliable, even when the teachers assessed two skills at the 
same time, instead of only one. However, further study may be carried 
out to determine the maximum number of skills that one teacher can 
assess in one session for a set of approximately 20 pupils and still maintain 
high standards of reliability.

Although the achievement in the skill of pouring out a liquid is very good, for
the volume measurement skill it is only about 33% of the total marks.
This is perhaps due to the fact that the skill of volume measurement is a
more technically oriented task involving the precision of reading a finely
graduated scale. Furthermore the relatively poor performance in volume
measurement reflects how poor some pupils' ability is in carrying out
some manipulative skills, as has been mentioned earlier by Bryce et al.
(reference 22).

No matter how simple a skill may be, it must be taught with great care so 
that pupils of even low ability range, who need extra help, can acquire the 
skill. 

Lack of significant correlation (shown by Pearson-r) between the
achievement data obtained from assessment by Method -T and that from
assessment by Method -P suggests that marking of the reported results
cannot be used as a substitute for the assessment of manipulative skill during
the practical session. The lack of statistical significance does not
necessarily mean that there is no educational significance in collecting
achievement data from different ways of assessing pupils carrying out
a skill in the laboratory. Some pupils can express their skills and abilities
best either in preparing a written report or by performing in the laboratory,
but not in both ways. So, at least for these pupils, it would be fair to assess
them in both ways, instead of considering one method of assessment as a
substitute for the other, no matter how convenient and time saving it could
be for the teacher. The low Pearson-r value may prevent the substitution
of Method -T by Method -P of assessment, but it may encourage the use of
both methods of assessment as complementary to each other.

Agreement among the judges (another measure of the reliability of
assessment) is almost 100% for the pouring out liquid skill, while it is
about 50% for the volume measurement skill. This is probably due to the
fact that the pouring skill is simpler than the volume measurement skill,
as has been seen from the pupils' achievement in performing these skills.
In measuring a certain volume of liquid using a measuring cylinder, the
pupil needs to perform more precision work as he/she is expected to bring
the lower meniscus of the liquid column to the correct mark on the scale.
Regarding the judges' tasks, as they have to follow a more complex
check-list for the volume measurement skill assessment, there is room for
variation in their opinion and hence lack of agreement.

Further research using a check-list of various degrees of complexity for 
different experiments can be carried out to ascertain whether the reliability 
of assessment changes, and if so, to what extent. 

The results of the second stage of the research show that the taught groups
can produce better results than the untaught groups in carrying out
manipulative skills. From this, it may be suggested that the pupils can be
expected to remember a skill taught even 8 weeks earlier. However, this
ability to retain and reproduce a scientific skill is likely to vary with its
nature and the complexity involved. For example, the achievement of
the taught group in performing the volume measurement skill is double
that of the untaught groups, whereas in the heating skill, the taught group
has done slightly better than the untaught groups (overall score being very
high for all the groups). This difference in the standard of achievement by
the taught group in 2 skills is probably due to the fact that the
pupils' learning ability depends on the degree of complexity of the skills
and their retention ability depends on how this knowledge is anchored in
the pupils' minds and assimilated with their existing ideas.

The results of the studies of pupils' ability to transfer skills to a new
situation (3rd stage of the research) show that the taught group can
produce better results than the untaught groups. As the achievement of 
the untaught groups is also very high, one may suggest that the cause for 
this is that the pupils participating in the assessed practical test are of very 
high academic ability and have talents to perform a new skill without any 
formal lesson on that (or any similar) skill. 

However the higher achievement of the taught groups indicates that
teaching does indeed help pupils to improve their knowledge, some of
which is retained and enables them to tackle a similar problem more
efficiently. In future, work can be carried out using pupils of a wide ability
range and asking them to execute a series of skills with varying degrees of 
complexity and difficulty. 

Pearson-r values showed no significant correlation between the
achievement data obtained from two modes of assessment (Method -T
and Method -P) for the volume measurement skill using a burette in the
third stage of this investigation. Lack of such a relationship was also
found, as mentioned earlier, in the first stage of this investigation. So it
would not be fair to assess the skills involved in an experiment only on
the basis of the final written report/result, unless it is clearly established
which skill (or skills) has a predominant influence on the final outcome of
the experiment. The lack of correlation could be due to the fact that the
outcome of the extended task/pupils' written report is not strongly affected
by the skills assessed in the laboratory. One needs to recognise that, apart
from the skills assessed, there are a number of skills involved in the
experiment which have not been assessed and could have influenced the
final result supplied in the written report. So all possible skills/steps
involved in the experiment have to be isolated and further studies of this
type have to be carried out. If a strong correlation exists between the pupils'
achievement data gathered from two sources, (a) assessment by Method -T
(on the spot observation of pupils' performance of the manipulative
skills/technique) and (b) assessment by Method -P (written report of the
whole task), then these skills could be assessed by Method -P instead of
Method -T, and the assessment will be valid for these specific skills only.

The results of this study support the view, suggested earlier by Bryce and
Robertson, that it is more important and meaningful to assess manipulative
skills by observation on the spot during experimental work, rather than
depending solely on the written report containing answers to some
questions or the recording of some observations, although the latter
method has been advocated by Woolnough and Toh. In their published
work, Woolnough and Toh (reference 30) gave detailed information about
how a number of reporting methods had been investigated and concluded
that a "broadly cued" method of reporting by the pupil proved to be the
best in reflecting pupils' laboratory performance, which can be assessed
by the teacher at a later stage. Although they have reported a strong
correlation (Pearson-r) between pupils' achievement assessed by a teacher
observing them directly in the laboratory and that assessed on the basis of
their written report, they have not given information regarding the
methodology (organisational aspect and check-list) used for on-the-spot
assessment. The strong correlation between the two sets of achievement
data would have been meaningful, and the case for replacing one method
of assessment by the other would have been strong, if they had provided
information about the nature of the skill assessed by direct observation,
and also the check-list used.



The findings of this research are likely to have some implications for
further research in this field, and also for classroom activities involving
teacher and pupil. One could extend this type of research work to other
age groups to gain a more generalised picture of the conclusions drawn
here. Practical work carried out by lower age group pupils can be
considered as pre-GCSE practice and preparation for the assessed practicals,
forming part of the GCSE examination which will be faced by the pupils at
the age of 15+. So, it would be beneficial for both teachers and pupils alike
to try to become involved in assessment (perhaps of the formative type) of
practical work from an early stage of secondary school education. As the
assessed practical is an integral part of the GCSE examination, practice and
preparation in this aspect from an early age will help the
pupils to do well in the final examination. Furthermore pupils tend to
work with more application if it is known to them that the practical work
is being assessed. Such a practice will help the teachers also. They will use
this experience to improve their teaching technique, develop a more
effective long-term teaching plan, and produce a reliable assessment
scheme. In the end a national assessment scheme may be developed so
that parity in the marking standard is maintained throughout the
schools and thus the need for rigorous moderation is reduced.

Further research will help to establish confidence in the reliability of
teachers' assessments made by direct observation of pupils' performance
in-situ. For this purpose, one needs to find out (1) if the reliability of
assessment varies from teacher to teacher or from school to school, (2) if it
varies as a teacher increases the items on the check-list (skills to be
assessed) in a session, and (3) the maximum number of skills one teacher
can assess in a session without compromising the reliability.

Reduction in the number of pupils per session/set to maintain/improve 
the reliability of assessment is not a realistic approach because a teacher, 
being in charge of a set, will be expected to assess all his pupils by himself 
in a session. Even if he does not assess all pupils, but decides to assess 
only a small section in every lesson for one skill, he will have to look after 
them and make sure that the un-assessed group spends their lesson-time 
in useful academic activities without disturbing the assessed group. 

Further research in this field would be needed to establish the reliability 
and validity of assessment of practical work on the basis of the outcome of 
the extended task or the final product. 

In this thesis, comparison between the performance of boys and that of
girls has not been considered, as a large number of studies have
already been done in this area. I feel that it may be a more
realistic idea not to give too much importance to the gender difference of
the pupils, and to treat them equally for teaching or assessment purposes.

More research of the type described in this thesis will help to develop a
suitable check-list for a well defined mark-scheme of all GCSE practicals to
maintain reliability of assessment between various schools, again helping
the moderators' task. Without a proper check-list and a well defined mark
allocation scheme, teachers may adopt the method of subjective
judgement (impression grading) and this may lead to erratic/irrational
marking based solely on the opinion of the assessor, which may vary
markedly. Therefore, well defined check-lists for assessment of
manipulative skills in science practicals will improve the reliability of
assessment (see Eglen and Kempa). It is also necessary to establish the
total number of skills a teacher could be expected to assess single handedly
in a class of 20-25 pupils in one practical session (say, a double period)
without lowering the standard of assessment (i.e. compromising the
reliability of assessment).

Further research on pupils' skill retention and transfer ability can be 
carried out to obtain more information regarding the cognitive aspect of 
practical work and make some advancement in the understanding of 
pupils' learning process during a practically orientated teaching session. 

The teaching and learning of the practical skills involved in science is
laboratory-based work. Assessment based only on written work (written
report of pupils' findings from the practical work) will neglect the
manipulative aspects of the practical work which become visible only
during the performance of the task itself in the laboratory.

Having carried out further research, one would be able to produce a
check-list of manipulative skills for an experiment which are difficult to
perform, subject to improvement by good teaching and thus capable of
influencing the final results. If it is established by the researcher that there
is a strong correlation between the pupils' achievement data obtained by
assessment Method -T (assessment of the "techniques", i.e. performance of
key manipulative skills on the spot in the laboratory) and that obtained by
assessment Method -P (assessment of the "product", i.e. outcome of the
whole task submitted as a written report), the latter method can be used as
a substitute for the former and teachers can assess pupils' practical skill
without the constraints of the classroom. However, until such a situation
is reached, it would be a wise policy to assess pupils from both the angles.
The results obtained from the two modes of assessment would be
complementary, providing a more complete picture of the pupils'
achievement profile concerning their ability to carry out the practical work
involved in school science courses.